Marginal Pdf Of X And Y
Sheldon H. Stein, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the authors and advance notification of the editor.
Abstract
Three basic theorems concerning expected values and variances of sums and products of random variables play an important role in mathematical statistics and its applications in education, business, the social sciences, and the natural sciences.
A solid understanding of these theorems requires that students be familiar with the proofs of these theorems. But while students who major in mathematics and other technical fields should have no difficulties coping with these proofs, students who major in education, business, and the social sciences often find it difficult to follow these proofs.
In many textbooks and courses in statistics which are geared to the latter group, mathematical proofs are sometimes omitted because students find the mathematics too confusing. In this paper, we present a simpler approach to these proofs. This paper will be useful for those who teach students whose level of mathematical maturity does not include a solid grasp of differential calculus.
Introduction
The following three theorems play an important role in all disciplines which make use of mathematical statistics: Let X and Y be two jointly distributed random variables.
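The statements of the three theorems do not appear in the text above. Judging from how they are used later (Theorems 1 and 2 are said to hold without independence, Theorem 3 only with it, and Theorem 3 is checked by comparing E(XY) with the expected values of X and Y), they appear to be the standard identities below. This is a reconstruction under that assumption, not the author's own wording.

```latex
\begin{align*}
\text{Theorem 1:}\quad & E(X+Y) = E(X) + E(Y) \\
\text{Theorem 2:}\quad & \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y) \\
\text{Theorem 3:}\quad & E(XY) = E(X)\,E(Y) \quad \text{if $X$ and $Y$ are statistically independent}
\end{align*}
```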
In mathematical statistics, one relies on these theorems to derive estimators and to examine their properties. But textbooks and courses differ with regard to the extent to which they cover these theorems and their proofs. The best coverage is found in texts like Hogg and Tannis and Mendenhall, Scheaffer, and Wackerly. Also, in some fields, like finance and econometrics, the subject matter relies rather heavily on these theorems and their proofs.
But other fields seem to take a more casual approach to the mathematical foundations of statistical theory. That, however, is required only for the proofs of the formulae and not of their use.
The availability of large amounts of data and good computational tools is what is required to make sense of statistics. For example, some of my students have argued with me in class, when the textbook was not immediately available, that Theorems 1 and 2 do require that X and Y be statistically independent and that Theorem 3 is always true. So without understanding the proofs, can one really understand their content and the statistical concepts that they support?
But even students who must master the content of these theorems are often uncomfortable with their formal proofs. The proofs of these three theorems make use of double summation notation in the case of discrete random variables. I suspect that a large part of the problem that students have with the proofs results from the fact that many of them suffer from what might be termed "double summation notation anxiety."
While the simplifications involve some loss of generality, I believe on the basis of my personal experience that the tradeoff is well worth it.
Joint Probability Distributions
To understand Theorems 1, 2, and 3, one must first understand what is meant by a joint probability distribution. While some textbooks written for mathematics and statistics majors (e.g., Hogg and Tannis and Mendenhall, Scheaffer, and Wackerly), as well as for other majors (like Mirer), illustrate the concept of a joint probability distribution by using an example for the discrete case, many (e.g., Becker, Kmenta, and Newbold) do not, which makes it more difficult for the student to grasp the material. Consider two random variables X and Y distributed as in Table 1.
P(x_i) and P(y_j) are also called marginal probabilities because they appear on the margins of the table. The sum of all of the joint probabilities p(x_i, y_j) is of course equal to unity, as is the sum of all of the marginal probabilities of X and the sum of all of the marginal probabilities of Y. The proofs of Theorems 2 and 3 follow a similar format (see Kmenta). My experience has been that most students do not feel comfortable with these proofs.
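As a small illustration of how marginal probabilities arise from a joint table, the sketch below sums a joint distribution over the other variable. The numbers in `joint` are hypothetical and are not the paper's Table 1; the function name `marginals` is likewise only an illustrative choice.

```python
from fractions import Fraction

# Hypothetical joint table (NOT the paper's Table 1): joint[(x, y)] = p(x, y).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}


def marginals(joint):
    """Return (P_X, P_Y), each found by summing the joint table over the other variable."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p  # row total: the margin for this x
        p_y[y] = p_y.get(y, 0) + p  # column total: the margin for this y
    return p_x, p_y


p_x, p_y = marginals(joint)
print("P(x):", p_x)
print("P(y):", p_y)
```

The joint probabilities, the marginal probabilities of X, and the marginal probabilities of Y each sum to one, as the text states.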
And in many textbooks, the proof is presented without a joint probability table, which compounds the difficulty for students. In either case, many students will memorize Theorems 1, 2, and 3 and learn their proofs if necessary.
Theorem 1: An Alternative Proof
I have found that most of my students can understand the proof of Theorem 1 and the other theorems if it is presented in the following manner.
Let us assume that random variables X and Y take on just two values each, as in Table 2. In other words, we multiply each joint probability by its X value and then sum. Many students have an easier time with this mode of presentation than the equivalent one expressed using double summation notation. This proof is straightforward, intuitive, and does not require the use of double or even single summation signs.
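A two-value version of the argument can be written out as follows. This is a sketch, assuming Theorem 1 is the additivity of expected values and writing p_{ij} for the joint probability p(x_i, y_j); the subscripted symbols are notation introduced here and may differ from the paper's Table 2.

```latex
\begin{align*}
E(X+Y) &= (x_1+y_1)\,p_{11} + (x_1+y_2)\,p_{12} + (x_2+y_1)\,p_{21} + (x_2+y_2)\,p_{22} \\
       &= x_1(p_{11}+p_{12}) + x_2(p_{21}+p_{22}) + y_1(p_{11}+p_{21}) + y_2(p_{12}+p_{22}) \\
       &= x_1 P(x_1) + x_2 P(x_2) + y_1 P(y_1) + y_2 P(y_2) \\
       &= E(X) + E(Y)
\end{align*}
```

The second line only regroups terms, and the third uses the fact that each marginal probability is a row or column total of the joint table. No independence assumption is needed.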
If one wants to make the proof a little more general, one can allow for a third value of one of the random variables. Once the student understands the simple proof above, it is, hopefully, a simple step forward to the more general proof.
Theorem 2
The proofs of Theorems 2 and 3 can be constructed using the same approach. At this point, the assumption of statistical independence of X and Y is utilized.
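The step at which independence enters can be sketched for the two-value case. The derivation assumes that Theorem 3 is the product rule for expected values and writes p_{ij} for p(x_i, y_j), which is notation introduced here. Independence means p_{ij} = P(x_i)P(y_j) for every cell.

```latex
\begin{align*}
E(XY) &= x_1 y_1\,p_{11} + x_1 y_2\,p_{12} + x_2 y_1\,p_{21} + x_2 y_2\,p_{22} \\
      &= x_1 y_1 P(x_1)P(y_1) + x_1 y_2 P(x_1)P(y_2) + x_2 y_1 P(x_2)P(y_1) + x_2 y_2 P(x_2)P(y_2) \\
      &= \bigl[x_1 P(x_1) + x_2 P(x_2)\bigr]\,\bigl[y_1 P(y_1) + y_2 P(y_2)\bigr] \\
      &= E(X)\,E(Y)
\end{align*}
```

Without independence the second line is not available, which is why the product rule can fail for dependent variables.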
This proof is much simpler than the general proof that is found in statistics textbooks.
Numerical Examples
The following numerical examples of the theorems are very useful in clarifying the material discussed above. Three cases will be presented: one in which the covariance between two random variables is positive, another in which the covariance is negative, and a third in which two random variables are statistically independent, which gives rise to a zero covariance.
Consider an experiment where two fair coins are tossed in the air simultaneously. Consider Table 3 below, where the random variable X represents the number of heads in an outcome and Y represents the number of tails. The expected value of X, and also of Y, is 1.0. The variance of X, and also of Y, is 0.5. Notice that the simple probability distributions from Table 4 are the same as the marginal probability distributions of Table 5.
The covariance of X and Y, i.e., E(XY) - E(X)E(Y), is 0.5 - 1.0 = -0.5. Note also that Theorems 1 and 2 find support in this example even though random variables X and Y are not statistically independent, since the joint probabilities are not equal to the product of their corresponding marginal probabilities. But Theorem 3 does not hold, as E(XY) is equal to 0.5 while E(X)E(Y) is equal to 1.0. Let us now consider Table 6, in which the random variables R and S are defined in a different way.
Random variable R is defined purely on the basis of the toss of the first coin and random variable S is defined purely on the basis of the second coin toss. If the outcome of the first coin toss is heads, we assign random variable R the value 1 while if the first coin toss is a tails, we assign R the value 0 without regard to the outcome of the second toss. Similarly, random variable S is defined solely on the basis of the toss of the second coin, without regard to the outcome of the first coin toss.
If the outcome of the second coin toss is heads, we assign random variable S the value 1 while if the second coin toss is a tails, we assign S the value 0. Hence, random variables R and S are statistically independent. From Table 6, we also derive Table 9, which presents the joint probability distribution table of random variables R and S. The covariance of R and S is zero, which is a consequence of the fact that R and S are statistically independent. We know this because each joint probability is equal to the product of the corresponding marginal probabilities.
Since the expected values of R and S are each 0.5, Theorem 3 requires E(RS) to equal 0.25. In Table 9, it is obvious that this is the case. Finally, we construct an example where random variables are defined in a coin flipping experiment where the covariance between the two random variables is positive. Consider Table 10 below. The generation of random variable X was presented in Table 3. Random variable W is defined by assigning a value of 2 if both coins turn up heads, -2 if both coins turn up tails, and zero if one coin turns up heads and the other turns up tails.
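The three coin-toss cases described in this section can be checked by enumerating the four equally likely outcomes. The sketch below assumes that Theorems 1, 2, and 3 are the sum rule for expected values, the variance-of-a-sum rule with a covariance term, and the product rule for expected values; the helper names (`expect`, `var`, `cov`, `check`) are illustrative and not taken from the paper.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two fair coins, e.g. ('H', 'T').
OUTCOMES = list(product("HT", repeat=2))
P = Fraction(1, 4)


def expect(f):
    """Expected value of f(outcome) over the four outcomes."""
    return sum(P * f(o) for o in OUTCOMES)


def var(f):
    """Variance of f(outcome): expected squared deviation from the mean."""
    m = expect(f)
    return expect(lambda o: (f(o) - m) ** 2)


def cov(f, g):
    """Covariance of f(outcome) and g(outcome)."""
    mf, mg = expect(f), expect(g)
    return expect(lambda o: (f(o) - mf) * (g(o) - mg))


def X(o): return o.count("H")                      # number of heads
def Y(o): return o.count("T")                      # number of tails
def R(o): return 1 if o[0] == "H" else 0           # first coin only
def S(o): return 1 if o[1] == "H" else 0           # second coin only
def W(o): return {2: 2, 0: -2}.get(o.count("H"), 0)  # 2 for HH, -2 for TT, else 0


def check(f, g):
    """Return (sum rule holds, variance rule holds, product rule holds, covariance)."""
    t1 = expect(lambda o: f(o) + g(o)) == expect(f) + expect(g)
    t2 = var(lambda o: f(o) + g(o)) == var(f) + var(g) + 2 * cov(f, g)
    t3 = expect(lambda o: f(o) * g(o)) == expect(f) * expect(g)
    return t1, t2, t3, cov(f, g)


for name, pair in {"X,Y": (X, Y), "R,S": (R, S), "X,W": (X, W)}.items():
    print(name, check(*pair))
```

Exact fractions are used instead of floats so that the equality checks are not affected by rounding. The sum and variance rules hold in all three cases, while the product rule holds only for the independent pair R and S.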
The probability distribution of W is shown below. Its mean is equal to 0.
Probability Distribution of W
W     Probability
-2    0.25
 0    0.50
 2    0.25
The joint probability distribution of X and W is found in Table 13. Because X and W are not statistically independent, which we can see by comparing the joint probabilities with the products of the concomitant marginal probabilities, Theorem 3 need not apply, and it does not.
Conclusion
In teaching a course in money and financial markets, I went over the proofs of Theorems 1, 2, and 3 in the manner presented above.
While I did it in order to facilitate the understanding of the application of those theorems to the study of risk, diversification, and the capital asset pricing model, my students reported that it also helped them understand the discussion of those theorems in an econometrics course that they were taking simultaneously.
Having taught that econometrics course in prior years, I remember the tension in the classroom when I went over the proofs of the three theorems above in the conventional manner with the double summation notation.
Unless one understands these theorems and their proofs, a course in econometrics becomes an exercise in rote memorization rather than in understanding. The same can be said for other courses using these theorems. Thus, the techniques outlined in this paper are useful whenever the objective of a course is not only to teach students how to do statistics, but also to understand statistics.
Acknowledgement
I would like to give thanks to two anonymous referees for useful comments and suggestions and to my graduate students at Cleveland State University for providing the inspiration to write this paper.
References
Becker, W.
Hogg, R.
Kmenta, J.
Mansfield, E.
Mendenhall, W.
Mirer, T.
Newbold, P.
Even math majors often need a refresher before going into a finance program. This book combines probability, statistics, linear algebra, and multivariable calculus with a view toward finance. You can see how linear algebra will start emerging. The marginal probability mass functions are what we get by looking at only one random variable and letting the other roam free. You can think of these as collapsing back to single-variable probability.
In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. It gives the probabilities of various values of the variables in the subset without reference to the values of the other variables. This contrasts with a conditional distribution, which gives the probabilities contingent upon the values of the other variables. Marginal variables are those variables in the subset of variables being retained. These concepts are "marginal" because they can be found by summing values in a table along rows or columns, and writing the sum in the margins of the table. The context here is that the theoretical studies being undertaken, or the data analysis being done, involves a wider set of random variables but that attention is being limited to a reduced number of those variables.
First consider the case when.
5.2: Joint Distributions of Continuous Random Variables
We have learned what a random variable, a probability mass function, and a probability density function are.
Consider a random vector whose entries are continuous random variables, called a continuous random vector. When taken alone, one of the entries of the random vector has a univariate probability distribution that can be described by its probability density function. This is called the marginal probability density function, in order to distinguish it from the joint probability density function, which instead describes the multivariate distribution of all the entries of the random vector taken together.
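For two continuous random variables, the marginal density is obtained by integrating the joint density over the other variable. The worked example below uses an illustrative joint density, f(x, y) = x + y on the unit square, which is an assumption made here and does not come from the text.

```latex
\begin{align*}
f_X(x) &= \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy, &
f_Y(y) &= \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx. \\[6pt]
\intertext{Example: for $f_{X,Y}(x,y) = x + y$ on $0 \le x \le 1,\ 0 \le y \le 1$ (zero elsewhere),}
f_X(x) &= \int_0^1 (x+y)\,dy = x + \tfrac{1}{2}, &
\int_0^1 f_X(x)\,dx &= \tfrac{1}{2} + \tfrac{1}{2} = 1.
\end{align*}
```

By symmetry, f_Y(y) = y + 1/2 on the same interval. The integral plays the role that row and column sums play in the discrete joint table.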
Generally, the variance of each of the jointly distributed random variables X and Y is defined as in the univariate case, as the expected squared deviation of that variable from its mean. The standard deviation is the square root of the variance. To determine the variance and standard deviation of each random variable that forms part of a multivariate distribution, we first determine their marginal distribution functions and then compute the variance and the standard deviation, just like in the univariate case.
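In the discrete case, the procedure described above can be written as follows. This is a sketch using standard definitions; the symbols p(x, y), p_X, and \mu_X are notation assumed here, since the source's own formulas are not shown.

```latex
\begin{align*}
p_X(x) &= \sum_{y} p(x,y), \qquad \mu_X = E(X) = \sum_{x} x\,p_X(x), \\
\operatorname{Var}(X) &= \sum_{x} (x-\mu_X)^2\,p_X(x) = E(X^2) - \bigl[E(X)\bigr]^2, \\
\sigma_X &= \sqrt{\operatorname{Var}(X)}.
\end{align*}
```

The same steps with the roles of x and y exchanged give the variance and standard deviation of Y.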
For the most part, however, we are going to be looking at moments about the mean, also called central moments. This handling also extends to situations where we have more than two variables. Expected values can easily be found from marginal distributions. You have been given the following joint pmf.
The goal was also to gain more intuition for widely used tools like derivatives, the area under the curve, and integrals. Then, we will see the concept of conditional probability and the difference between dependent and independent events.