Public Health Forum


A Forum to discuss Public Health Issues in Pakistan

Welcome to the most comprehensive portal on Community Medicine/ Public Health in Pakistan. This website contains content rich information for Medical Students, Post Graduates in Public Health, Researchers and Fellows in Public Health, and encompasses all super specialties of Public Health. The site is maintained by Dr Nayyar R. Kazmi



    Biostatistics-Definitions

    Dr Abdul Aziz Awan


    Number of posts : 685
    Age : 56
    Location : WHO Country Office Islamabad
    Job : National Coordinator for Polio Surveillance
    Registration date : 2007-02-23


    Post by Dr Abdul Aziz Awan Tue Jul 08, 2008 1:58 pm

    Measures of Central Tendency




    What is the most common, most typical, or
    most often-occurring value of a variable? The following chart shows which
    measures of central tendency can be used with variables measured at the
    nominal, ordinal, or interval/ratio level.





                Nominal    Ordinal    Interval/Ratio
    Mode           X          X             X
    Median                    X             X
    Mean                                    X





    Mode




    The mode, or modal value, is the most
    commonly occurring value or category of a variable. To find the mode, look for
    the category that contains the highest number of observations. For example, in
    a survey of 88 cities, the most common form of city governance is the
    council/manager form.



    Type of Government    Number of Cities
    Commission                    4
    Weak Mayor                   17
    Strong Mayor                 22
    Council/Manager              45
    Total                        88

    Note, however, that a variable may have two modal
    categories. For example, if the type of government had looked like this,



    Type of Government    Number of Cities
    Commission                   14
    Weak Mayor                   18
    Strong Mayor                 28
    Council/Manager              28
    Total                        88

    then the variable would have two modal categories,
    "Strong Mayor" and "Council/Manager" (a bimodal distribution). This
    means that there is no single central tendency within the data for this
    variable. (More is said about the shape of a distribution under "Skew",
    below.)
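    As a rough illustration, the two city-government tables can be checked with a few lines of Python. This is only a sketch: the function name modal_categories and the dictionary layout are assumptions made for the example, not part of the original material.

```python
def modal_categories(counts):
    """Return every category that shares the highest count."""
    highest = max(counts.values())
    return [category for category, n in counts.items() if n == highest]

# Counts copied from the two city-government tables.
survey_one = {"Commission": 4, "Weak Mayor": 17,
              "Strong Mayor": 22, "Council/Manager": 45}
survey_two = {"Commission": 14, "Weak Mayor": 18,
              "Strong Mayor": 28, "Council/Manager": 28}

print(modal_categories(survey_one))  # ['Council/Manager']
print(modal_categories(survey_two))  # ['Strong Mayor', 'Council/Manager']
```

    Returning a list rather than a single value makes the two-mode case visible instead of silently picking one category.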



    Median




    The median is the value of the category
    or case that divides an ordered distribution into two equal parts. One half of
    the values will be higher than the median value; the other half of the values
    will be lower than the median value. To find the median, you must first put all
    the observations in order, from lowest to highest. Then use the formula
    (N+1)/2 to find the position of the median in the ordered list.

    For example, if there are 7 categories of employee pay,
    the median category will be category number 4, or (7+1)/2=4. In the example
    below, the median pay category is $24,000. The value of this category can also
    be interpreted as the median pay value.



    Pay:  $12,000  $17,000  $18,000  $24,000  $25,000  $27,000  $30,000

    However, if there are 8 categories of employee pay, the
    median pay value will fall in between two categories. The median position is
    4.5, or (8+1)/2=9/2=4.5. Add the values of the fourth and the fifth categories
    and divide by two, or ($24,000+$25,000)/2=$49,000/2=$24,500.



    Pay:  $12,000  $17,000  $18,000  $24,000  $25,000  $27,000  $30,000  $58,000
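    The (N+1)/2 rule can be sketched in Python as follows. The helper name median_by_position is made up for this example; the standard-library function statistics.median is used only as a cross-check.

```python
import math
import statistics

def median_by_position(values):
    """Find the median using the (N+1)/2 position rule."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2              # 1-based position, may end in .5
    lower = ordered[math.floor(position) - 1]      # value just below (or at) the position
    upper = ordered[math.ceil(position) - 1]       # value just above (or at) the position
    return (lower + upper) / 2

seven = [12000, 17000, 18000, 24000, 25000, 27000, 30000]
eight = seven + [58000]

print(median_by_position(seven))  # 24000.0
print(median_by_position(eight))  # 24500.0

# Cross-check against the standard library.
print(statistics.median(seven) == median_by_position(seven))  # True
print(statistics.median(eight) == median_by_position(eight))  # True
```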

    If you have grouped data, that is, ranges of values, as
    well as the number of people found in each group or range, there is a more
    precise way to calculate the median. For example, assume that the pay
    categories are ranges of pay, and employees are distributed among them as
    follows:



    Pay Range            Number of employees    Cumulative Frequency
    $20,000-$29,000               9                       9
    $30,000-$39,000              14                      23
    $40,000-$49,000              16                      39
    $50,000-$59,000              21                      60
    Total                        60                       -

    The position of the median is found by using the formula for grouped
    data, N/2. In this case, there are 60 employees, so the median is at
    60/2=the 30th observation.

    We can see from the cumulative distribution that the 30th
    observation will be found in the category of $40,000-$49,000. We can estimate
    the median by calculating the mid-point of this category, by adding the lower
    boundary value to the higher boundary value and dividing by two, or
    ($40,000+$49,000)/2=$89,000/2=$44,500.

    To calculate the median more exactly, we can look at how
    many observations into the $40,000-$49,000 category we must go to find the 30th
    observation.

    There are 16 observations in this category. We must go to
    the 7th observation to find the 30th total observation of the sample. So we can
    calculate the value of going 7/16th of the way through this category.

    The category spans 10 values, in thousands of dollars
    (40,41,42,43,44,45,46,47,48,49). So 7/16 x 10 = 4.375, or $4,375.

    We add this to the lower boundary value of the category
    to get the median salary value of $40,000+$4,375=$44,375.

    Note that we must assume that the observations are evenly
    distributed within the categories; if the sample size is large enough in
    relation to the number of categories, this is usually not a problem.
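
    The interpolation just described can be written as a short Python sketch. The function name grouped_median, the (lower bound, count) layout, and the fixed class width of $10,000 are assumptions chosen to match the worked example; the even spread of observations within each category is assumed, as noted above.

```python
def grouped_median(groups, width):
    """Estimate the median from grouped data by linear interpolation.

    groups: list of (lower_bound, count) pairs in ascending order.
    width:  the width of each category (assumed equal for all categories).
    """
    total = sum(count for _, count in groups)
    if total == 0:
        raise ValueError("no observations")
    target = total / 2            # position of the median, N/2
    cumulative = 0                # observations in the categories already passed
    for lower, count in groups:
        if cumulative + count >= target:
            # Go (target - cumulative)/count of the way through this category.
            return lower + (target - cumulative) / count * width
        cumulative += count

pay_groups = [(20000, 9), (30000, 14), (40000, 16), (50000, 21)]
print(grouped_median(pay_groups, width=10000))  # 44375.0
```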

    Post by Dr Abdul Aziz Awan Tue Jul 08, 2008 1:59 pm

    Mean




    The mean, or average, is the arithmetic
    balance point of the distribution; it is not necessarily the same as the
    median or the mode. If you add up the distances from the mean of all the
    observations that lie above it, that sum will equal the sum of the distances
    from the mean of all the observations that lie below it.



    To find the mean


    1) add up the value of all the
    observations in the sample, and


    2) divide that sum by the total
    number of observations.

    The average of the following employee salaries is equal
    to $21,857.14



    Salaries:  $12,000  $17,000  $18,000  $24,000  $25,000  $27,000  $30,000

    However, the average of the following salaries is equal
    to $26,375.



    Salaries:  $12,000  $17,000  $18,000  $24,000  $25,000  $27,000  $30,000  $58,000

    This demonstrates the fact that the value of the mean is
    sensitive to very high, or very low values. In this case, it may be better to
    use the median.
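
    A small Python sketch makes the effect of the single high salary visible. The variable names are chosen for this example only.

```python
import statistics

seven_salaries = [12000, 17000, 18000, 24000, 25000, 27000, 30000]
eight_salaries = seven_salaries + [58000]   # same list plus one very high salary

for label, salaries in (("7 salaries", seven_salaries),
                        ("8 salaries", eight_salaries)):
    mean = sum(salaries) / len(salaries)     # step 1: add up, step 2: divide by N
    median = statistics.median(salaries)
    print(f"{label}: mean = ${mean:,.2f}, median = ${median:,.2f}")

# 7 salaries: mean = $21,857.14, median = $24,000.00
# 8 salaries: mean = $26,375.00, median = $24,500.00
```

    Adding one salary of $58,000 moves the mean by more than $4,500 but moves the median by only $500.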



    To find the mean from grouped data:






    1) first find the midpoint for each
    range or category. This is found by adding the lower boundary value to
    the upper boundary value and dividing by two.


    2) Then multiply the number of
    employees in each range by the midpoint of the range.


    3) Add up all the products of
    (number of employees times range midpoint)


    4) Divide that sum by the total
    number of observations.












    Pay Range            Number of employees    Range Midpoint    Product
    $20,000-$29,000               9                $24,500         $220,500
    $30,000-$39,000              14                $34,500         $483,000
    $40,000-$49,000              16                $44,500         $712,000
    $50,000-$59,000              21                $54,500       $1,144,500
    Total                        60                   -          $2,560,000

    In this case, $2,560,000/60=$42,666.67
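
    The four steps for grouped data can be sketched in Python like this. The tuple layout (lower bound, upper bound, number of employees) is an assumption made for the example.

```python
# (lower bound, upper bound, number of employees) for each pay range
pay_ranges = [
    (20000, 29000, 9),
    (30000, 39000, 14),
    (40000, 49000, 16),
    (50000, 59000, 21),
]

total_product = 0
total_employees = 0
for lower, upper, employees in pay_ranges:
    midpoint = (lower + upper) / 2          # step 1: midpoint of the range
    product = employees * midpoint          # step 2: employees times midpoint
    total_product += product                # step 3: add up the products
    total_employees += employees
    print(f"${lower:,}-${upper:,}: midpoint ${midpoint:,.0f}, product ${product:,.0f}")

grouped_mean = total_product / total_employees   # step 4: divide by N
print(f"Grouped mean: ${grouped_mean:,.2f}")     # Grouped mean: $42,666.67
```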



    Measures of Dispersion




    Measures of dispersion complement
    measures of central tendency. Measures of central tendency attempt to describe
    the most typical or central value of a distribution of values of a variable.
    Measures of dispersion, in contrast, attempt to give an idea of how widely
    dispersed the values are and how different the observations are from one another.

    The following chart shows which measures of variation and dispersion can be used
    with variables measured at the nominal, ordinal, or interval/ratio level.





                          Nominal    Ordinal    Interval/Ratio
    Range                               X             X
    Percentiles                         X             X
    Standard Deviation                                X
    Variance                                          X




    Range




    The range is the difference between the
    highest and the lowest values in an ordered distribution of the values of a
    variable. For example, if the highest paid employee makes $58,000 per year and
    the lowest paid employee makes $12,000 per year, the salary range is $46,000.
    (Note that the average is $26,375.)



    Pay:  12,000  17,000  18,000  24,000  25,000  27,000  30,000  58,000

    However, if the highest paid employee makes $29,000 per
    year and the lowest paid employee makes $22,500 per year, the salary range is
    $6,500. (Note that the average is $26,312.50.)



    Pay:  22,500  24,500  25,000  26,000  27,000  28,000  28,500  29,000

    Although these two organizations
    have very similar averages, they have very different ranges. For which
    organization would you rather be working?
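
    The comparison of the two pay lists can be reproduced with a minimal Python sketch. The names organization_a and organization_b are labels invented for this example.

```python
organization_a = [12000, 17000, 18000, 24000, 25000, 27000, 30000, 58000]
organization_b = [22500, 24500, 25000, 26000, 27000, 28000, 28500, 29000]

for name, pay in (("A", organization_a), ("B", organization_b)):
    pay_range = max(pay) - min(pay)          # highest value minus lowest value
    average = sum(pay) / len(pay)
    print(f"Organization {name}: range = ${pay_range:,}, average = ${average:,.2f}")

# Organization A: range = $46,000, average = $26,375.00
# Organization B: range = $6,500, average = $26,312.50
```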





    Post by Dr Abdul Aziz Awan Tue Jul 08, 2008 1:59 pm

    Percentiles




    In general, percentiles are points in the
    distribution of the ordered values of a variable at which a known percentage of
    observations fall below the point and a known percentage of observations remain
    above the point.

    For example, the 50th percentile is the same as the
    median; at the 50th percentile, half of the observations have higher values and
    half of the observations have lower values.

    Percentiles are often used on standardized tests, such as
    the SAT or GRE. If you scored at the 75th percentile, that means that 75% of
    the other people scored below your score and 25% scored at or above your score.


    Sometimes on tests for civil service, applicants are
    advised that they must score at a certain percentile or above to be considered
    for an interview, a promotion, etc.

    When two organizations have very different ranges but
    similar averages, you may want to use the interquartile range, or the range
    between the 25th and 75th percentiles. The interquartile range contains the
    middle 50% of the observations.



    To arrive at the interquartile range,






    1) ignore the bottom 25% of the
    categories and the top 25% of the categories.


    2) re-calculate the range,
    subtracting the new bottom category from the new top category.

    For example, in this case, there are 8 categories, so
    one-quarter of 8 categories=2 categories. Ignoring the top two and the bottom
    two categories, the interquartile range for this organization is
    $27,000-$18,000=$9,000.



    Pay:  12,000  17,000  18,000  24,000  25,000  27,000  30,000  58,000

    The interquartile range for this organization is
    $28,000-$25,000=$3,000.



    Pay:  22,500  24,500  25,000  26,000  27,000  28,000  28,500  29,000
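
    The two-step method described above (drop the bottom and top quarter, then take the range of what is left) can be sketched in Python. The function name simple_interquartile_range is hypothetical. Note that statistical packages usually interpolate the 25th and 75th percentiles, so they may report somewhat different values from this simplified rule.

```python
def simple_interquartile_range(values):
    """Range of the middle half, using the drop-a-quarter-from-each-end rule."""
    ordered = sorted(values)
    cut = len(ordered) // 4                       # one-quarter of the categories
    middle = ordered[cut:len(ordered) - cut]      # ignore bottom 25% and top 25%
    return middle[-1] - middle[0]                 # new top minus new bottom

first_organization = [12000, 17000, 18000, 24000, 25000, 27000, 30000, 58000]
second_organization = [22500, 24500, 25000, 26000, 27000, 28000, 28500, 29000]

print(simple_interquartile_range(first_organization))   # 9000
print(simple_interquartile_range(second_organization))  # 3000
```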







    Standard Deviation




    The standard deviation is a measure of
    the average difference of each observation in a distribution from the average
    (mean) of the distribution. Given that you can calculate a mean, how different
    are most of the observations from that mean?

    If most of the observations are near the mean in value,
    the standard deviation will be small. But if most of the observations are far
    from the mean in value, the standard deviation will be large.



    The steps for calculating the standard deviation are






    1) calculate the mean


    2) subtract the value of each
    observation from the value of the mean


    3) square the difference obtained
    in step 2 for each observation


    4) add up all the squared
    differences obtained in step 3


    5) divide the sum of the squared
    differences by the total number of observations


    6) take the square root of the
    result of step 5=the value of the standard deviation
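
    The six steps translate directly into Python. This is a sketch under one stated assumption: dividing by the total number of observations, as in step 5, gives what is usually called the population standard deviation, which is why the cross-check uses statistics.pstdev (a sample standard deviation would divide by N-1 instead). The function name population_sd and the pay figures reused from the earlier examples are choices made for this illustration.

```python
import math
import statistics

def population_sd(values):
    """Standard deviation following the six steps, dividing by N."""
    mean = sum(values) / len(values)                  # step 1
    differences = [mean - v for v in values]          # step 2
    squared = [d ** 2 for d in differences]           # step 3
    total = sum(squared)                              # step 4
    average_squared = total / len(values)             # step 5
    return math.sqrt(average_squared)                 # step 6

wide_pay = [12000, 17000, 18000, 24000, 25000, 27000, 30000, 58000]
narrow_pay = [22500, 24500, 25000, 26000, 27000, 28000, 28500, 29000]

print(round(population_sd(wide_pay), 2))
print(round(population_sd(narrow_pay), 2))

# Cross-check against the standard library's population formula.
print(math.isclose(population_sd(wide_pay), statistics.pstdev(wide_pay)))  # True
```

    The pay list with the $58,000 salary produces the larger standard deviation, because its observations lie farther from the mean.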






    Variance




    The variance is an expression of the
    total amount of variability of the observations for a variable. The value of
    the variance is obtained by squaring the value of the standard deviation.

    A variable with a large variance has a great deal of
    difference in the values of the various observations, while a variable with a
    small variance has less difference in the values of the various observations.
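
    As a brief sketch, the relationship between variance and standard deviation can be checked with Python's statistics module. The population versions (pvariance and pstdev) are assumed here, and the two pay lists are the ones used in the range examples.

```python
import math
import statistics

spread_out = [12000, 17000, 18000, 24000, 25000, 27000, 30000, 58000]
close_together = [22500, 24500, 25000, 26000, 27000, 28000, 28500, 29000]

for label, pay in (("spread out", spread_out), ("close together", close_together)):
    variance = statistics.pvariance(pay)
    sd = statistics.pstdev(pay)
    # The variance equals the standard deviation squared.
    print(label, round(variance, 2), math.isclose(variance, sd ** 2))
```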



    Skew




    The skew refers to the "shape"
    of the distribution of the values of a variable. The values of a variable can
    be plotted on a chart.

    If the values of the observations are distributed
    symmetrically around the mean of a variable, the distribution is symmetric;
    the normal distribution is the best-known example. In this case, if the
    distribution has a single peak, the mean, median, and mode will all coincide.

    If the values of most of the observations are lower than
    the value of the mean, then the distribution is called a positively skewed or
    right skewed distribution. In this case, the mode will typically have a lower
    value than the median, and the mean will have a higher value than the median.

    If the values of most of the observations are higher than
    the value of the mean, then the distribution is called a negatively skewed or
    left skewed distribution. In this case, the mode will typically have a higher
    value than the median, and the mean will have a lower value than the median.

    An inspection of the skewness of a variable will help the
    researcher to decide which of the three measures of central tendency to
    use--mean, median, or mode--as the best indicator of the central tendency of
    the distribution of values for that variable.
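
    One way to inspect skewness numerically is sketched below in Python. The moment coefficient of skewness used here is a common textbook measure, not something given in the text above, and the function name skewness is chosen for this example. A positive result indicates a right (positive) skew, a negative result a left (negative) skew, and zero a symmetric distribution.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / second_moment ** 1.5

salaries = [12000, 17000, 18000, 24000, 25000, 27000, 30000, 58000]
symmetric = [1, 2, 3, 4, 5]

print(skewness(salaries) > 0)    # True: the $58,000 salary creates a long right tail
print(skewness(symmetric))       # 0.0

# In the right-skewed salary list the mean lies above the median.
print(sum(salaries) / len(salaries) > statistics.median(salaries))  # True
```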



    Normal Curve




    The Normal Curve is a graph of the values
    of a variable where those values are distributed symmetrically about the mean
    of the variable. It has the following characteristics:

    1) it has a bell-shaped, symmetrical curve

    2) the mean, median, and mode all have the same value

    3) the properties of the curve are known

    4) it is useful in calculating estimates in inferential statistics

    If the distribution of the values of a variable approaches
    a normal curve, we know that approximately 68% of the values will be within
    plus or minus one standard deviation from the mean; about 95% of the values
    will be within plus or minus two standard deviations from the mean; and about
    99.7% of the values will be within plus or minus three standard deviations
    from the mean.
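
    These proportions can be checked with the NormalDist class in Python's standard statistics module (available from Python 3.8). The helper name share_within is invented for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)   # mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within plus or minus k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")

# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```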

    This is useful because the value of any one observation
    can be converted to a standardized score, or z-score. A standardized score or
    z-score converts any observation to a measure of standard deviation units,
    where the value of the mean equals zero and the value of a standard deviation
    equals one.



    The steps for calculating the z-score are



    1) calculate the mean of the
    variable


    2) calculate the standard deviation
    of the variable


    3) subtract the value of the mean
    from the value of the observation


    4) divide the result of step 3 by
    the standard deviation of the variable=z-score

    A z-score of +1.5 means that the value of the observation
    lies 1.5 standard deviation units above the mean. A z-score of -2.0 means that
    the value of the observation lies 2.0 standard deviation units below the mean.

    If a student scores 40 out of 50 on a test of mathematics
    (with a mean of 41 and a standard deviation of 5), and 60 out of 75 on a test
    of language (with a mean of 55 and a standard deviation of 10), the scores are
    not directly comparable. Converting each of them to their respective z-scores
    allows them to be compared directly.

    z-score for 40=(40-41)/5=-0.2

    z-score for 60=(60-55)/10=+0.5

    Although the student scored 80% of the total points
    available on each test, the student did slightly better than average (+0.5
    standard deviations) on the language test, and slightly worse than average
    (-0.2 standard deviations) on the math test.
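
    The same comparison can be written as a short Python sketch. The function name z_score is an assumption for this example; the scores, means, and standard deviations are the ones given for the two tests.

```python
def z_score(observation, mean, standard_deviation):
    """Distance of an observation from the mean, in standard deviation units."""
    return (observation - mean) / standard_deviation

math_z = z_score(40, mean=41, standard_deviation=5)
language_z = z_score(60, mean=55, standard_deviation=10)

print(f"math: {40 / 50:.0%} of points, z = {math_z:+.1f}")          # math: 80% of points, z = -0.2
print(f"language: {60 / 75:.0%} of points, z = {language_z:+.1f}")  # language: 80% of points, z = +0.5
```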