Sampling Distribution of the Sample Proportion, p-hat (2024)

    CO-6: Apply basic concepts of probability, random variation, and commonly used statistical probability distributions.

    Behavior of Sample Proportions

    Learning Objectives

    LO 6.21: Apply the sampling distribution of the sample proportion (when appropriate). In particular, be able to identify unusual samples from a given population.

    EXAMPLE 6: Behavior of Sample Proportions

    Approximately 60% of all part-time college students in the United States are female. (In other words, the population proportion of females among part-time college students is p = 0.6.) What would you expect to see in terms of the behavior of a sample proportion of females (p-hat) if random samples of size 100 were taken from the population of all part-time college students?

    As we saw before, due to sampling variability, the sample proportion in random samples of size 100 will take numerical values that vary according to the laws of chance: in other words, the sample proportion is a random variable. To summarize the behavior of any random variable, we focus on three features of its distribution: the center, the spread, and the shape.

    Based only on our intuition, we would expect the following:

    Center: Some sample proportions will be on the low side — say, 0.55 or 0.58 — while others will be on the high side — say, 0.61 or 0.66. It is reasonable to expect all the sample proportions in repeated random samples to average out to the underlying population proportion, 0.6. In other words, the mean of the distribution of p-hat should be p.

    Spread: For samples of 100, we would expect sample proportions of females not to stray too far from the population proportion 0.6. Sample proportions lower than 0.5 or higher than 0.7 would be rather surprising. On the other hand, if we were only taking samples of size 10, we would not be at all surprised by a sample proportion of females even as low as 4/10 = 0.4, or as high as 8/10 = 0.8. Thus, sample size plays a role in the spread of the distribution of sample proportion: there should be less spread for larger samples, more spread for smaller samples.

    Shape: Sample proportions closest to 0.6 would be most common, and sample proportions far from 0.6 in either direction would be progressively less likely. In other words, the shape of the distribution of sample proportion should bulge in the middle and taper at the ends: it should be somewhat normal.

    Comment:

    • The distribution of the values of the sample proportions (p-hat) in repeated samples (of the same size) is called the sampling distribution of p-hat.

    The purpose of the next video and activity is to check whether our intuition about the center, spread and shape of the sampling distribution of p-hat was correct via simulations.

    Video: Simulation #1 (p-hat) (4:13)

    Did I Get This?: Simulation #1 (p-hat)

    At this point, we have a good sense of what happens as we take random samples from a population. Our simulation suggests that our initial intuition about the shape and center of the sampling distribution is correct. If the population has a proportion of p, then random samples of the same size drawn from the population will have sample proportions close to p. More specifically, the distribution of sample proportions will have a mean of p.

    We also observed that for this situation, the sample proportions are approximately normal. We will see later that this is not always the case. But if sample proportions are normally distributed, then the distribution is centered at p.
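
    The center of the sampling distribution can also be checked with a short simulation. The sketch below is not the simulation from the video; it assumes each sampled student is an independent draw with success probability p = 0.6, and the function name simulate_p_hats, the seed, and the number of repetitions are illustrative choices.

```python
import random
import statistics


def simulate_p_hats(p, n, num_samples, seed=0):
    """Return the sample proportion from each of num_samples random samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Each draw counts as a "success" (e.g., female) with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats


p_hats = simulate_p_hats(p=0.6, n=100, num_samples=5000)
mean_p_hat = statistics.mean(p_hats)
print(f"mean of simulated p-hats: {mean_p_hat:.3f}")  # should land close to 0.6
```

    Individual sample proportions vary from sample to sample, but their average over many repetitions should sit very near the population proportion.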

    Now we want to use simulation to help us think more about the variability we expect to see in the sample proportions. Our intuition tells us that larger samples will better approximate the population, so we might expect less variability in large samples.

    In the next walk-through we will use simulations to investigate this idea. After that walk-through, we will tie these ideas to more formal theory.

    Video: Simulation #2 (p-hat) (4:55)

    Did I Get This?: Simulation #2 (p-hat)

    The simulations reinforced what makes sense to our intuition. Larger random samples will better approximate the population proportion. When the sample size is large, sample proportions will be closer to p. In other words, the sampling distribution for large samples has less variability. Advanced probability theory confirms our observations and gives a more precise way to describe the standard deviation of the sample proportions. This is described next.
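
    The effect of sample size on variability can be sketched the same way. The sample sizes 25, 100, and 400 and the 2,000 repetitions below are arbitrary choices for illustration, and the draws are again assumed to be independent with p = 0.6.

```python
import random
import statistics


def simulate_p_hats(p, n, num_samples, rng):
    """Return the sample proportion from each of num_samples random samples of size n."""
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
spreads = {}
for n in (25, 100, 400):
    p_hats = simulate_p_hats(0.6, n, 2000, rng)
    # Standard deviation of the simulated sample proportions for this n.
    spreads[n] = statistics.stdev(p_hats)
    print(f"n = {n:3d}: standard deviation of p-hat is about {spreads[n]:.3f}")
```

    The printed standard deviations should shrink as n grows, which is the pattern the theory in the next section makes precise.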

    The Sampling Distribution of the Sample Proportion

    If repeated random samples of a given size n are taken from a population of values for a categorical variable, where the proportion in the category of interest is p, then the mean of all sample proportions (p-hat) is the population proportion (p).

    As for the spread of all sample proportions, theory dictates the behavior much more precisely than saying that there is less spread for larger samples. In fact, the standard deviation of all sample proportions depends on the sample size n, as indicated below.

    \(\sigma_{\hat{p}}=\sqrt{\dfrac{p(1-p)}{n}}\)

    Since the sample size n appears in the denominator of the square root, the standard deviation does decrease as sample size increases. Finally, the shape of the distribution of p-hat will be approximately normal as long as the sample size n is large enough. The convention is to require both np and n(1 – p) to be at least 10.
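
    As a rough sketch, the standard deviation formula and the "at least 10" convention can be written as two small helper functions. The names p_hat_sd and normal_approx_ok are hypothetical, introduced here only for illustration.

```python
import math


def p_hat_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def normal_approx_ok(p, n, threshold=10):
    """Check the convention that both np and n(1 - p) are at least 10."""
    return n * p >= threshold and n * (1 - p) >= threshold


p = 0.6
for n in (10, 100):
    print(f"n = {n}: sd = {p_hat_sd(p, n):.3f}, "
          f"normal approximation ok: {normal_approx_ok(p, n)}")
```

    With p = 0.6, samples of size 100 satisfy the convention (np = 60 and n(1 – p) = 40), while samples of size 10 do not (np = 6).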

    We can summarize all of the above by the following:

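    If repeated random samples of size n are taken from a population in which the proportion in the category of interest is p, then p-hat has mean \(\mu_{\hat{p}}=p\) and standard deviation \(\sigma_{\hat{p}}=\sqrt{\dfrac{p(1-p)}{n}}\), and its distribution is approximately normal as long as np ≥ 10 and n(1 – p) ≥ 10.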

    Let’s apply this result to our example and see how it compares with our simulation.

    In our example, n = 25 (sample size) and p = 0.6. Note that np = 15 ≥ 10 and n(1 – p) = 10 ≥ 10. Therefore we can conclude that p-hat has approximately a normal distribution with mean p = 0.6 and standard deviation

    \(\sigma_{\hat{p}}=\sqrt{\dfrac{0.6(1-0.6)}{25}} \approx 0.098\)

    (which is very close to what we saw in our simulation).

    Comment:

    • These results are similar to those for binomial random variables (X) discussed previously. Be careful not to confuse the results for the mean and standard deviation of X with those of p-hat.

    Learn by Doing: Sampling Distribution of p-hat

    Did I Get This?: Sampling Distribution of p-hat

    If a sampling distribution is normally shaped, then we can apply the Standard Deviation Rule and use z-scores to determine probabilities. Let’s look at some examples.

    EXAMPLE 7: Using the Sampling Distribution of p-hat

    A random sample of 100 students is taken from the population of all part-time students in the United States, for which the overall proportion of females is 0.6.

    (a) There is a 95% chance that the sample proportion (p-hat) falls between what two values?

    First note that the distribution of p-hat has mean p = 0.6, standard deviation

    \(\sigma_{\hat{p}}=\sqrt{\dfrac{p(1-p)}{n}}=\sqrt{\dfrac{0.6(1-0.6)}{100}} \approx 0.05\)

    and a shape that is close to normal, since np = 100(0.6) = 60 and n(1 – p) = 100(0.4) = 40 are both greater than 10. The Standard Deviation Rule applies: the probability is approximately 0.95 that p-hat falls within 2 standard deviations of the mean, that is, between 0.6 – 2(0.05) and 0.6 + 2(0.05). There is roughly a 95% chance that p-hat falls in the interval (0.5, 0.7) for samples of this size.
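
    The interval in part (a) can be reproduced in a few lines. This sketch keeps the unrounded standard deviation (about 0.049), so its endpoints differ slightly from the (0.5, 0.7) obtained with the rounded value 0.05; the variable names are illustrative.

```python
import math

p, n = 0.6, 100
sd = math.sqrt(p * (1 - p) / n)  # unrounded; the text rounds this to 0.05

# Standard Deviation Rule: about 95% of values fall within 2 sd of the mean.
low, high = p - 2 * sd, p + 2 * sd
print(f"sd = {sd:.4f}, approximate 95% interval = ({low:.3f}, {high:.3f})")
```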

    (b) What is the probability that sample proportion p-hat is less than or equal to 0.56?

    To find

    \(P(\hat{p} \leq 0.56)\)

    we standardize 0.56 into a z-score by subtracting the mean and dividing the result by the standard deviation. Then we can find the probability using the standard normal calculator or table.

    \(P(\hat{p} \leq 0.56)=P\left(Z \leq \dfrac{0.56-0.6}{0.05}\right)=P(Z \leq-0.80)=0.2119\)
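
    In place of a table, the same probability can be computed with the NormalDist class from Python's standard statistics module (Python 3.8 or later). The sketch below does the calculation twice: once with the rounded standard deviation 0.05 used above, and once with the unrounded value, to show how little the rounding matters. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n, cutoff = 0.6, 100, 0.56

# Calculation as in the text, with the standard deviation rounded to 0.05.
sd_rounded = 0.05
z_rounded = (cutoff - p) / sd_rounded          # about -0.80
prob_rounded = NormalDist().cdf(z_rounded)     # standard normal P(Z <= z)

# Same calculation with the unrounded standard deviation.
sd_exact = math.sqrt(p * (1 - p) / n)
prob_exact = NormalDist(mu=p, sigma=sd_exact).cdf(cutoff)

print(f"z = {z_rounded:.2f}, P(p-hat <= 0.56) is about {prob_rounded:.4f}")
print(f"with unrounded sd: about {prob_exact:.4f}")
```

    Both versions give a probability a little above 0.20, so the conclusion does not depend on the rounding.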

    To see the impact of the sample size on these probability calculations, consider the following variation of our example.

    EXAMPLE 8: Using the Sampling Distribution of p-hat

    A random sample of 2500 students is taken from the population of all part-time students in the United States, for which the overall proportion of females is 0.6.

    (a) There is a 95% chance that the sample proportion (p-hat) falls between what two values?

    First note that the distribution of p-hat has mean p = 0.6, standard deviation

    \(\sigma_{\hat{p}}=\sqrt{\dfrac{p(1-p)}{n}}=\sqrt{\dfrac{0.6(1-0.6)}{2500}} \approx 0.01\)

    and a shape that is close to normal, since np = 2500(0.6) = 1500 and n(1 – p) = 2500(0.4) = 1000 are both greater than 10. The Standard Deviation Rule applies: the probability is approximately 0.95 that p-hat falls within 2 standard deviations of the mean, that is, between 0.6 – 2(0.01) and 0.6 + 2(0.01). There is roughly a 95% chance that p-hat falls in the interval (0.58, 0.62) for samples of this size.

    (b) What is the probability that sample proportion p-hat is less than or equal to 0.56?

    To find

    \(P(\hat{p} \leq 0.56)\)

    we standardize 0.56 into a z-score by subtracting the mean and dividing the result by the standard deviation. Then we can find the probability using the standard normal calculator or table.

    \(P(\hat{p} \leq 0.56)=P\left(Z \leq \dfrac{0.56-0.6}{0.01}\right)=P(Z \leq-4) \approx 0\)

    Comment:

    • As long as the sample is truly random, the distribution of p-hat is centered at p, no matter what size sample has been taken. Larger samples have less spread. Specifically, when we multiplied the sample size by 25, increasing it from 100 to 2,500, the standard deviation was reduced to 1/5 of the original standard deviation, because the standard deviation is proportional to \(1/\sqrt{n}\) and \(\sqrt{25}=5\). The sample proportion strays less from the population proportion 0.6 when the sample is larger: it tends to fall between 0.5 and 0.7 for samples of size 100, whereas it tends to fall between 0.58 and 0.62 for samples of size 2,500. A sample proportion as low as 0.56 is not so improbable for samples of 100 (the probability is more than 20%), but it is almost impossible for samples of 2,500 (the probability is virtually zero).
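
    The comparison between the two sample sizes can be summarized in one short sketch. It uses unrounded standard deviations, so the probabilities differ slightly from the table-based values in Examples 7 and 8; the helper name p_hat_sd is hypothetical.

```python
import math
from statistics import NormalDist


def p_hat_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


p, cutoff = 0.6, 0.56
sd_100 = p_hat_sd(p, 100)
sd_2500 = p_hat_sd(p, 2500)

# Multiplying n by 25 divides the standard deviation by sqrt(25) = 5.
ratio = sd_2500 / sd_100

# Normal-approximation probability of a sample proportion at or below 0.56.
prob_100 = NormalDist(mu=p, sigma=sd_100).cdf(cutoff)
prob_2500 = NormalDist(mu=p, sigma=sd_2500).cdf(cutoff)

print(f"sd ratio (n = 2500 vs n = 100): {ratio:.2f}")
print(f"P(p-hat <= 0.56), n = 100:  {prob_100:.4f}")
print(f"P(p-hat <= 0.56), n = 2500: {prob_2500:.6f}")
```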

    Applet: Sampling Distribution for a Sample Proportion
