In the context of this exercise on determining the necessary data during the data science process, what are common values used for margin of error and confidence level?
Depending on how accurate you wish the data to be, you can essentially choose whatever values you want for margin of error and confidence. However, there is always a tradeoff: more accurate data requires more work, because a larger sample size is needed.
The margin of error describes how far the true value may be from what we obtained in our survey. As you increase the margin of error, you need a smaller sample size, but the result is less accurate, so smaller values are better. Usually, a value of around 5% is used for the margin of error.
The confidence level gives the probability that the interval defined by the margin of error (the survey result plus or minus the margin) contains the true proportion of the population. The larger this value, the more “confident” we can be that the data is close to the actual population’s. The most common value used for confidence level is 95%. However, 90% and 99% are also commonly used.
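To make the tradeoff concrete, here is a rough Python sketch. It assumes the usual textbook formula for estimating a proportion, n = z^2 * p(1 - p) / e^2, with p = 0.5 as the worst case and a population large enough that its size can be ignored. The function names are made up for illustration, and the exercise's calculator may work differently.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def base_sample_size(margin: float, confidence: float, p: float = 0.5) -> float:
    """Sample size for a very large population (assumed formula, see above)."""
    z = z_score(confidence)
    return z ** 2 * p * (1 - p) / margin ** 2


# Compare the common confidence levels against a few margins of error.
for confidence in (0.90, 0.95, 0.99):
    for margin in (0.10, 0.05, 0.01):
        n = base_sample_size(margin, confidence)
        print(f"confidence={confidence:.0%} margin={margin:.0%} -> n is about {n:.0f}")
```

Under these assumptions, tightening the margin or raising the confidence level both push the required sample size up, which is the tradeoff described above.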
Wouldn’t that mean that the confidence interval is a function of the margin of error? How are they related/different?
One would think that the confidence interval would depend on the sample or population size, not margin of error on any given sample. Without a sufficient population to begin with, margin of error is impossible to compute.
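One way to see how these pieces relate is to turn the calculation around. The sketch below assumes the standard normal-approximation formula for a proportion, e = z * sqrt(p(1 - p) / n), where z comes from the confidence level and n is the sample size. The confidence interval is then the survey result plus or minus e. The helper name and the p = 0.5 default are illustrative assumptions, not something taken from the exercise.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n: int, confidence: float, p: float = 0.5) -> float:
    """Margin of error for a sample proportion (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


# Same sample size, different confidence levels: the margin widens
# as we ask for more confidence.
for confidence in (0.90, 0.95, 0.99):
    print(f"n=385 confidence={confidence:.0%} margin={margin_of_error(385, confidence):.4f}")

# Same confidence level, larger samples: the margin shrinks.
for n in (100, 385, 1000):
    print(f"n={n} confidence=95% margin={margin_of_error(n, 0.95):.4f}")
```

So under this formula the two are not independent: for a fixed sample size, picking a higher confidence level forces a wider margin of error, and the sample size is what lets you shrink the margin at a given confidence level.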
Hi! I just want to ask: how can we calculate the sample size?
I think in the exercise the first question should say that the population size is 200, not the sample size. Otherwise, how can you calculate the sample size when it’s already given to you?
A population size is the whole shebang, whereas a sample size is a subset of the population sufficient in number to arrive at a near estimate, all other factors remaining the same.
I think the question is well posed. They actually want you to increase the population size until you find the number that gives you a sample size of 200, which is 416 (I wrote an entire explanation of how to obtain this value from the sample size formula, but apparently these comment fields don’t support LaTeX format, sorry).
Is there any way you could break down the process of how you arrived at the value of 416? I understand the main idea but am still confused about how you came up with the 416 value. Thanks
The whole formula of the calculator would be nice to see. You can get 416 just by matching the sample size to 200. But the second part, 5000, isn’t possible; 384 is the top number you can get by increasing the population.
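The exercise doesn't show the calculator's formula, so the following is only a guess at it: the standard sample-size formula with a finite population correction, using 95% confidence (z = 1.96), a 5% margin of error, and p = 0.5. Those values and the function name are assumptions, but they do reproduce the numbers quoted in this thread.

```python
def sample_size(population: int, margin: float = 0.05,
                z: float = 1.96, p: float = 0.5) -> float:
    """Required sample size with a finite population correction (assumed formula)."""
    # Sample size for an unlimited population: about 384.16 with the defaults.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Shrink it for a finite population of the given size.
    return n0 / (1 + (n0 - 1) / population)


print(round(sample_size(416)))    # 200
print(round(sample_size(5000)))   # 357
```

Because the corrected value can never exceed n0, the sample size levels off just above 384 no matter how large the population gets.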
I found a way, here goes the link:
Sample size equation clearing N
And the image, in case you don’t trust the link:
As core6620398233 pointed out, the formula stops working past a sample size of 384: it starts giving negative values for N, because 384 is the largest sample size you can get.
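For anyone who wants to try this without the link, here is one possible way to solve for N. It assumes the standard finite-population formula n = n0 / (1 + (n0 - 1) / N) with n0 = z^2 * p(1 - p) / e^2, which rearranges to N = n(n0 - 1) / (n0 - n). The defaults (z = 1.96, e = 0.05, p = 0.5) and the function name are assumptions, and this is not necessarily the same equation as the linked one.

```python
def population_for_sample(n: float, margin: float = 0.05,
                          z: float = 1.96, p: float = 0.5) -> float:
    """Population size N that yields sample size n (assumed formula, solved for N)."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # about 384.16 with the defaults
    # The denominator turns negative once n exceeds n0, so N goes negative too.
    return n * (n0 - 1) / (n0 - n)


print(round(population_for_sample(200)))   # 416
print(population_for_sample(385) < 0)      # True
```

The sign flip happens exactly where n passes n0 (about 384.16), which matches the observation that 384 is the largest sample size the formula can produce.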
Sorry for so much editing, just in case you don’t know:
With the values given in the exercise, you actually reach a sample size of 384 with a population size of 222,640; any population size smaller than that gives you a smaller sample size. With a population of 5,000, the sample size is 357.
Awesome, this definitely helped me. Thank you for taking the time to respond.
Thanks, Dave. Huge help as well.
I remember most of these terms and actions from high school AP Statistics. This breakdown of margin of error and confidence level is true. I think one thing I can gather from all of this is that it all depends on how confident you are in your hypothesis. If the margin of error is increased, then your confidence level surely decreases along with the sample size. My only question is: during the research, can you change the confidence level?
Well, according to the question, the approximate answer should be a population size anywhere from 414 to 418 if you use the provided calculator. I think the best estimate is around 416-417 if you are not using the actual calculation method.
But 414 gives the same 200 value, right?
It does if you round it up; the real value is 199.511… All the population values from 414 to 418 round (up or down) to a sample size of 200.
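A quick way to check that range is a brute-force scan. This sketch assumes the calculator uses the standard finite-population formula with z = 1.96, a 5% margin of error, and p = 0.5; the function name and those defaults are assumptions rather than anything confirmed by the exercise.

```python
def sample_size(population: int, margin: float = 0.05,
                z: float = 1.96, p: float = 0.5) -> float:
    """Required sample size with a finite population correction (assumed formula)."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    return n0 / (1 + (n0 - 1) / population)


# Collect every population size up to 1000 whose sample size rounds to 200.
matches = [N for N in range(1, 1001) if round(sample_size(N)) == 200]
print(matches)  # [414, 415, 416, 417, 418]
```

Of those, 416 is the one whose unrounded sample size lands closest to 200.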
I’m also curious about the relationship between margin of error and confidence level. Are they correlated with each other, or are they independent?
@jephos249, you are using the terms ‘accurate’ and ‘confidence’ interchangeably. Could you please explain or justify this? As a layperson, I understood these two terms to have very different meanings. In the context of your post, would the term ‘precision’ be a better fit than the term ‘accuracy’? Thanks.
Never. Neither is a good fit for presenting reality. The best we can do is couch it as fit for our purposes.
Accuracy is a term that relies upon precision, and precision relies upon a degree which, no matter what, is going to involve rounding. Where do we best round? That’s a rhetorical question.
Precision in our speak might refer simply to the number of digits, or decimal places, as applicable. Significant figures, or SigDigs, would follow this convention. Our precision is limited by the inputs. 3.14 cannot be given more digits, even while pi may be represented with an infinite number of decimal places.
How useful is ‘accurate to within a tenth of a degree’ when we’re speaking of objects 10 billion light years away? Or even 100 million light years? Or 2.5 million light years (Andromeda) away? A tenth of a degree is larger than the size of a galaxy at even a short distance on that scale. Accuracy is wasted as a goal. SigDigs are the key to this discussion. Focus on them.