Chapter 2 Random variables - inferential statistics.

Exercise 2.1 Supposing \(Z\) is a standard normal variable, find the probabilities (with three decimals) that

  1. \(Z \geq 2\)
  2. \(Z \leq 3\)
  3. \(Z \geq 3.93\)
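Probabilities of this kind can be read from a table, but they can also be checked numerically. The following is a minimal sketch using Python's standard library; the function names `upper_tail` and `lower_tail` are illustrative choices, and the value 1.96 is deliberately not one of the exercise items.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def lower_tail(z):
    """Return P(Z <= z) for a standard normal Z."""
    return Z.cdf(z)

def upper_tail(z):
    """Return P(Z >= z) = 1 - P(Z <= z) for a standard normal Z."""
    return 1.0 - Z.cdf(z)

# Illustrative value (not taken from the exercise).
print(round(upper_tail(1.96), 3))
print(round(lower_tail(1.96), 3))
```

For very large arguments the upper-tail probability becomes so small that it rounds to zero at three decimals, which is worth keeping in mind when reporting results.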

Exercise 2.2 Suppose a factory produces some commodity, and that producing one unit of the commodity requires 3 units of raw material A and 4 units of raw material B. If the future unit costs of the materials are uncertain, they can be modeled by random variables \(X\) (for material A) and \(Y\) (for material B). The total raw material cost per unit is then a new variable, \[ W = 3X + 4Y\;.\] Assume \[ \mu_x = 200, \ \mu_y = 90, \ \sigma_x = 20, \ \sigma_y = 10\;. \] Also assume \(X\) and \(Y\) have normal distributions.

Consider the following questions 1 and 2:

  1. What are the mean and standard deviation of the total raw material cost?
  2. What is the 5% worst-case scenario value for the cost? (That is, determine a value \(b\) such that the cost \(W\) is above \(b\) with probability 0.05.)

Answer questions 1 and 2 in each of the three cases below.

  1. The correlation between \(X\) and \(Y\) is 0.90
  2. \(X\) and \(Y\) are uncorrelated
  3. The correlation is \(-0.40\).

Summarize what you find regarding the risk of high raw material cost as the correlation shifts from strong positive to zero and further to negative. Try to explain in simple terms why this happens.
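One way to organize the computation is sketched below. It relies on the standard result \(\mathrm{Var}(aX + bY) = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\rho\,\sigma_x\sigma_y\), and on the extra assumption that \(X\) and \(Y\) are jointly normal, so that \(W\) is itself normal. The function names and the demonstration values are hypothetical; substitute the exercise's numbers to check your own answers.

```python
from math import sqrt
from statistics import NormalDist

def linear_combination(a, mu_x, sigma_x, b, mu_y, sigma_y, rho):
    """Return (mean, sd) of W = a*X + b*Y when corr(X, Y) = rho."""
    mean = a * mu_x + b * mu_y
    variance = ((a * sigma_x) ** 2 + (b * sigma_y) ** 2
                + 2 * a * b * rho * sigma_x * sigma_y)
    return mean, sqrt(variance)

def upper_quantile(mean, sd, tail_prob):
    """Return b with P(W > b) = tail_prob, assuming W is normal."""
    return NormalDist(mean, sd).inv_cdf(1.0 - tail_prob)

# Illustrative values only: two unit-variance variables with mean zero.
for rho in (0.5, 0.0, -0.5):
    mean, sd = linear_combination(1, 0.0, 1.0, 1, 0.0, 1.0, rho)
    print(rho, mean, round(sd, 3), round(upper_quantile(mean, sd, 0.05), 3))
```

Running the loop with different correlations makes it easy to see how the spread of \(W\), and hence the worst-case value, responds to \(\rho\).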

Exercise 2.3 Suppose \(X\) and \(Y\) are two variables, each with expected value 10 and standard deviation 2. We assume the variables are uncorrelated. Find the expected value and standard deviation of each of the following variables:

  • \(W_1 = X + Y\)
  • \(W_2 = 2X\)
  • \(W_3 = X - Y\)
  • \(W_4 = 5 + X\)
  • \(W_5 = 5 + X + Y\)
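
The rules needed for these five cases can be stated compactly. For constants \(a\), \(b\), \(c\) (symbols introduced here for illustration) and uncorrelated \(X\) and \(Y\), \[ E(a + bX + cY) = a + b\,E(X) + c\,E(Y), \qquad \mathrm{Var}(a + bX + cY) = b^2\,\mathrm{Var}(X) + c^2\,\mathrm{Var}(Y)\;. \] The standard deviation is the square root of the variance, so it is the variances, not the standard deviations, that add.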

Exercise 2.4 As part of an analysis of the real estate market (the market for houses and flats), a researcher is interested in the number of days a flat is on the market before it is sold. Let \(\mu\) be the current mean number of days, and let \(p\) be the proportion of flats that are sold within five business days. In a sample of 400 flats, she finds sample mean \(\bar{x}= 17\) and sample standard deviation \(S=5\). She also finds that 100 of the flats in the sample were sold within five days.

  1. Find a 95% confidence interval for \(\mu\) based on the sample data.
  2. Find a 95% confidence interval for \(p\) based on the sample data.
  3. In part 1 you probably found the interval in the form \(\bar{x} \pm ME\), where \(ME\) is the margin of error. Explain what would happen to the margin of error if (i) we had a larger sample at the same 95% confidence level; (ii) we had the same sample size, but wanted a 99% confidence interval instead of 95%.
  4. The researcher observes that a few flats are on the market for more than 100 days without being sold. What alternative (to the sample mean) empirical measure would you use for describing the “typical number of days to sale” for a flat? In what way could the sample mean be misleading?
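
Large-sample confidence intervals of the form "estimate \(\pm\) ME" can be sketched as follows. This assumes the normal approximation (reasonable for samples of several hundred) and uses the sample standard deviation in place of \(\sigma\). The function names and the demonstration numbers are hypothetical and are not the exercise data.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(level):
    """Two-sided standard normal critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)

def mean_ci(xbar, s, n, level=0.95):
    """Approximate confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    me = z_critical(level) * s / sqrt(n)
    return xbar - me, xbar + me

def proportion_ci(successes, n, level=0.95):
    """Approximate confidence interval for a proportion (normal approximation)."""
    p_hat = successes / n
    me = z_critical(level) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - me, p_hat + me

# Illustrative values only.
print(mean_ci(50.0, 10.0, 100))
print(proportion_ci(50, 100))
```

Calling the functions with a different `n` or `level` is a quick way to check your reasoning about how the margin of error behaves.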

Exercise 2.5 This exercise uses data from a random sample of 1600 used cars in Norway. It concerns CO\(_2\) emissions from the cars, which are usually measured in grams per kilometer (g/km). We assume the sample is representative of the whole population of used cars in Norway. In the sample, the mean CO\(_2\) emission was found to be \(\bar{x} = 164\) g/km. The sample standard deviation was \(s = 35\) g/km. It was also found that 200 of the 1600 observed cars had emissions above 250 g/km. In the following, let \(\mu\) denote the population mean emission of all cars, and let \(p\) be the population proportion of cars with emissions above 250 g/km.

  1. Find a 95% confidence interval for \(\mu\) based on the sample data.
  2. Suppose the government has a goal of getting the mean emission \(\mu\) from cars below 160 g/km. Is it likely that this goal has already been reached? (Argue from the confidence interval for \(\mu\). If you do not have the answer to part 1, explain how you would have used the interval.)
  3. Find a 95% confidence interval for \(p\).
  4. In the confidence interval above, you used a margin of error (ME). Decide whether each of the following statements is true or false.
     (i) If the sample size were increased to 2000, with the same sample proportion observed, the ME would be smaller.
     (ii) If the sample size were kept at 1600, but the confidence level were raised to 99%, the ME would be smaller.

Exercise 2.6 Suppose the weekly sales (in liters) of a brand of milk in a supermarket can be described by a normal random variable with mean \(\mu = 900\) and standard deviation \(\sigma = 100\). Suppose the supermarket’s inventory of the milk can only be increased once a week by reordering.

  1. Find the probability that less than 1050 liters are sold in a given week.
  2. The store manager wants to find an inventory level \(b\) such that a stock-out (i.e., the inventory is empty) will occur in only 1 out of 20 (i.e., 5%) of the weeks. Determine the level \(b\).
  3. Assume the inventory cost for the milk is NOK 200 per week. The supermarket earns NOK 1.10 per liter sold. The weekly profit from selling the milk is then the revenue from sold milk minus the inventory cost. Find the expected value and standard deviation of the weekly profit.
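
Two techniques appear in this exercise: finding a normal quantile, and taking the mean and standard deviation of a linear function of a random variable. A minimal sketch of both follows, under the simplifying assumption (as in the exercise) that sales are not capped by the inventory when computing profit. The function names and demonstration values are hypothetical.

```python
from statistics import NormalDist

def stock_level(mu, sigma, stockout_prob):
    """Return b such that P(demand > b) = stockout_prob for normal demand."""
    return NormalDist(mu, sigma).inv_cdf(1.0 - stockout_prob)

def linear_transform(intercept, slope, mu, sigma):
    """Return (mean, sd) of intercept + slope * X, where X has mean mu and sd sigma."""
    return intercept + slope * mu, abs(slope) * sigma

# Illustrative values only (not the exercise data).
print(round(stock_level(100.0, 10.0, 0.025), 1))
print(linear_transform(-5.0, 2.0, 100.0, 10.0))
```

A fixed cost enters as a negative intercept: it shifts the expected profit but leaves the standard deviation unchanged.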

Exercise 2.7 The transportation company OOPS is (among many other things) collecting pallets of empty bottles from supermarkets to bring back to a mineral water producer. Suppose the total time it takes to load a pallet onto a truck at a supermarket can be modeled by a normal random variable with mean 6 minutes and standard deviation 1.5 minutes.

  1. Give an interval of times \([a, b]\) such that the actual loading time at a supermarket will be in the interval with approximately 95% certainty. Describe briefly the rule or method you use.
  2. Find the probability that the loading time is less than 7.5 minutes.
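
Whatever rule you use for the interval, its actual coverage can be checked numerically. The sketch below computes the probability that a normal variable falls within \(k\) standard deviations of its mean; the function name `k_sigma_coverage` is an illustrative choice, and the demonstration uses a standard normal variable rather than the loading-time data.

```python
from statistics import NormalDist

def k_sigma_coverage(mu, sigma, k):
    """Return P(mu - k*sigma <= X <= mu + k*sigma) for normal X."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Coverage for k = 1, 2, 3 with a standard normal variable.
for k in (1, 2, 3):
    print(k, round(k_sigma_coverage(0.0, 1.0, k), 4))
```

The coverage depends only on \(k\), not on the particular mean and standard deviation, which is why such rules apply to any normal variable.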

Another successful delivery from OOPS!