Chapter 2

A line listing is the equivalent of a spreadsheet. Each row is called an observation or record, which is the patient or case, and each column is a variable. Instead of making line listings in Excel, you can use EPI Info, free software published by the CDC. To make a line listing in EPI Info, click create survey and enter all the data.

There are four variable types, seen in the columns of a line listing. First is the nominal-scale variable. This variable is all about placing an observation or record in or out of a group; gender and vaccination status are examples. Second is the ordinal-scale variable. This variable groups by rank, where each rank is not necessarily an equal step above the one before; an example is stage of cancer. Third is the interval-scale variable. This is like the ordinal variable but its intervals are evenly spaced; an example is date of birth. Fourth is the ratio-scale variable. This is like the interval variable but it has a true zero, so the magnitude of a value is meaningful and you can say one value is twice another; an example is duration of illness.

If a line listing is a spreadsheet of all observations/records matched with their individual data, then a frequency distribution is a table of a variable's values matched with the number of records/observations that have each value. Doing so reduces the size of the data and concentrates the information, which makes the bigger picture easier to understand. To make a frequency distribution in EPI Info, click frequency, then the variable.
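To see the idea outside of EPI Info, here is a minimal Python sketch that collapses a line listing into a frequency distribution. The records, the field name `dpt_doses`, and the helper `frequency_distribution` are all made up for illustration; they are not from the textbook or from EPI Info.

```python
from collections import Counter

# Hypothetical line listing: one record (row) per child.
line_listing = [
    {"id": 1, "dpt_doses": 3},
    {"id": 2, "dpt_doses": 4},
    {"id": 3, "dpt_doses": 3},
    {"id": 4, "dpt_doses": 0},
    {"id": 5, "dpt_doses": 2},
    {"id": 6, "dpt_doses": 3},
    {"id": 7, "dpt_doses": 4},
    {"id": 8, "dpt_doses": 3},
]


def frequency_distribution(records, variable):
    """Count how many records have each value of the chosen variable."""
    counts = Counter(record[variable] for record in records)
    # Sort by value so the table reads from lowest to highest.
    return dict(sorted(counts.items()))


dose_counts = frequency_distribution(line_listing, "dpt_doses")
for value, count in dose_counts.items():
    print(f"{value} doses: {count} children")
```

Eight rows shrink to four, one per distinct value, which is the "concentrating" effect described above.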

The central location (central tendency) of a frequency distribution is the area where most of the data points cluster. The spread of a frequency distribution is described using the range and the standard deviation. Skewness describes the shape of a frequency distribution curve. It is named for the tail, not the hump, so left skewed means the tail is on the left side. The normal distribution, or normal bell curve, is also called the Gaussian distribution; it is symmetric, so its mean, median, and mode are equal.
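As a rough illustration of spread, this sketch computes the range and the sample standard deviation with Python's standard library. The `incubation_days` values are invented for the example, and using the sample (n - 1) formula is an assumption on my part.

```python
import statistics

# Hypothetical data: incubation period in days for ten cases.
incubation_days = [2, 3, 3, 4, 4, 4, 5, 5, 6, 14]

# Range: the distance between the smallest and largest observations.
data_range = max(incubation_days) - min(incubation_days)

# Sample standard deviation (divides by n - 1).
standard_deviation = statistics.stdev(incubation_days)

print("range:", data_range)
print("standard deviation:", round(standard_deviation, 2))
```

The single value of 14 sits far out on the right, so this made-up data set would be called right skewed.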

Don't forget about the mode. Although it can seem insignificant next to the mean, it is not. Think about a vaccination clinic looking at data on the days people attended: they would want to know which day people come most often. Or ask what is the most common number of DPT vaccine doses children in your community have received. The average is not what matters here; the mode is. To find the mode in EPI Info, just create a histogram and see which bar is highest.
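The clinic example can be sketched in a few lines of Python. The attendance list is invented, and `statistics.multimode` is used here (rather than a histogram) because it returns every value tied for most common.

```python
import statistics

# Hypothetical attendance log: the weekday of each clinic visit.
clinic_visits = ["Mon", "Tue", "Tue", "Wed", "Tue", "Fri", "Mon"]

# Hypothetical DPT dose counts for eight children.
dpt_doses = [3, 4, 3, 0, 2, 3, 4, 3]

# multimode returns a list, so ties for "most common" are not hidden.
busiest_days = statistics.multimode(clinic_visits)
most_common_doses = statistics.multimode(dpt_doses)

print("busiest day(s):", busiest_days)
print("most common dose count(s):", most_common_doses)
```

Note that the first list is nominal data, where a mean would not even make sense, but the mode still works.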

In EPI Info, to find the median (which is also the 50th percentile) and the mean, click the means tab, choose the variable of interest under "means of", then click OK. The median is preferred when dealing with skewed data, because extreme values pull the mean toward the tail.
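Here is a small sketch of why the median is the safer choice for skewed data. The `hospital_days` values are made up, with one very long stay added on purpose.

```python
import statistics

# Hypothetical length of hospital stay in days; one patient stayed 30 days.
hospital_days = [1, 2, 2, 3, 3, 4, 30]

mean_stay = statistics.mean(hospital_days)
median_stay = statistics.median(hospital_days)  # the 50th percentile

print("mean:", round(mean_stay, 2))
print("median:", median_stay)
```

The one long stay drags the mean well above what a typical patient experienced, while the median stays at the middle observation.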

The centering property of the mean is interesting. Subtract the mean from every value, add up all of these differences, and you get zero.
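A quick way to convince yourself is to try it on a few numbers. This sketch uses made-up values; with decimals the sum can come out as a tiny rounding error instead of exactly zero, so the check allows for that.

```python
import math

# Hypothetical observations.
values = [2, 4, 9]

mean = sum(values) / len(values)

# Deviation of each observation from the mean.
deviations = [value - mean for value in values]
total_deviation = sum(deviations)

print("mean:", mean)
print("deviations:", deviations)
print("sum of deviations is zero:", math.isclose(total_deviation, 0, abs_tol=1e-9))
```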

The midrange is an interesting tool as well. You take the lowest observation, add it to the highest observation, and divide by 2. It is interesting because when calculating the midrange for age, you add the lowest observation, the highest observation, and 1 before you divide by 2. The extra 1 compensates for age being recorded in completed years: someone listed as 17 could be any age up to the day before turning 18.
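Both versions of the midrange fit in one small function. The name `midrange`, the `is_age` flag, and the sample ages are my own choices for this sketch.

```python
def midrange(values, is_age=False):
    """Return the midpoint between the smallest and largest observation.

    For ages, 1 is added to the sum before halving, as described above.
    """
    total = min(values) + max(values)
    if is_age:
        total += 1
    return total / 2


# Hypothetical ages of cases in an outbreak.
ages = [4, 9, 12, 17]

print("plain midrange:", midrange(ages))
print("age midrange:", midrange(ages, is_age=True))
```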

The geometric mean is different. To calculate it, multiply all the values together, then take the nth root, where n equals the number of observations. Or take the log of every value, get the mean of the logs, then compute the antilog of that mean. It is used for data that show a logarithmic pattern, such as 1, .1, .01, .001, .0001, etc. For positive values it is never greater than the arithmetic mean (the two are equal only when every value is the same), and it suits data points that differ greatly from each other or change exponentially.
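The two recipes give the same answer, which this sketch checks on the example series from the paragraph above. The function names are hypothetical, base-10 logs are an arbitrary choice (any base works as long as the antilog matches), and all values must be positive.

```python
import math


def geometric_mean_by_root(values):
    """Multiply everything together, then take the nth root."""
    n = len(values)
    return math.prod(values) ** (1 / n)


def geometric_mean_by_logs(values):
    """Average the base-10 logs, then take the antilog of that average."""
    mean_log = sum(math.log10(value) for value in values) / len(values)
    return 10 ** mean_log


values = [1, 0.1, 0.01, 0.001, 0.0001]

by_root = geometric_mean_by_root(values)
by_logs = geometric_mean_by_logs(values)
arithmetic_mean = sum(values) / len(values)

print("geometric mean (root):", by_root)
print("geometric mean (logs):", by_logs)
print("arithmetic mean:", arithmetic_mean)
```

The log version is usually the practical one for long lists, since multiplying many small numbers together can shrink toward zero before the root is taken.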

The Undergrad Edition - Epidemiology

Friday, October 4, 2013
