Peer Discussion Response
1. In formal statistical inference, the t-test is appropriate only when the populations being sampled are approximately normally distributed and the standard deviation is estimated from the sample. In this scenario, several statistics are computed each quarter for each of the two regions, and the goal is to determine whether there are statistically significant differences between the regions on each of the aforementioned measures. If we are unsure whether the populations themselves are normal, we have to rely on the central limit theorem, which requires sufficiently large samples, rather than take the t-test's normality condition for granted. As for choosing between a t-based test and a chi-square test, the latter is used primarily for cell counts, meaning how many times a service, product, or event has been observed in each category. A comparison of means of a non-discrete variable will likely call for a t-test. That would include, but is not limited to, the mean cost per claim and the average age of claimants.
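These do not include the measures that call for a chi-square test, which include, but are not limited to, the proportion of claims being litigated, the proportion of men versus women, and the number of emergency procedures.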
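To make the comparison of means concrete, here is a minimal sketch of Welch's two-sample t statistic, one common form of the t-test that does not assume equal variances in the two regions. The function name welch_t and the ages below are hypothetical illustrations, not data from the scenario, and the sketch stops at the statistic and degrees of freedom rather than computing a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance uses the sample (n - 1) denominator.
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    # Squared standard error of the difference between the two means.
    se_sq = var_a / n_a + var_b / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical claimant ages for each region (made up for illustration).
south_ages = [34, 45, 52, 38, 61, 47, 29, 55]
north_ages = [41, 50, 58, 44, 66, 53, 39, 60]

t_stat, df = welch_t(south_ages, north_ages)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The statistic would then be compared with a t distribution with df degrees of freedom. With samples this small, the comparison is only trustworthy if the ages in each region are roughly normal.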
The difference between the mean and the median arises when a few data points have values much larger than most of the others. If a few claims are expensive, the median will be lower than the mean, because the median depends only on the middle of the ordered data and is therefore driven by the many small values. For a normal distribution the median and the mean are the same, so if the claims data were normally distributed we would expect to see similar values for both. But the claims data are not normally distributed: since the mean is greater than the median, the distribution is most likely positively skewed. In this type of distribution, a great number of values lie below the mean and a few values lie far above it. The average claim should therefore be computed using the median, because a few very high values do not affect this particular measure. In general, if we use the mean, we obtain a single number for the population, but not one that is representative of a typical claim.
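A small sketch can illustrate how a few expensive claims pull the mean away from the median. The claim amounts below are invented for illustration and are not taken from the scenario.

```python
import statistics

# Hypothetical claim costs: mostly minor claims plus two very expensive ones.
claims = [120, 150, 180, 200, 210, 250, 300, 320, 9500, 15000]

mean_cost = statistics.mean(claims)
median_cost = statistics.median(claims)
# Count how many claims fall below the mean.
below_mean = sum(1 for c in claims if c < mean_cost)

print("mean:", mean_cost)
print("median:", median_cost)
print("claims below the mean:", below_mean, "of", len(claims))
```

With these made-up values the mean is 2623 while the median is 230, and eight of the ten claims fall below the mean, which is the pattern of a positively skewed distribution.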
James Cochran, Jeffrey Camm, David Anderson, Dennis Sweeney, Thomas Williams. Modern Business Statistics, 6th Edition.
You are a manager working for an insurance company. Your job entails processing individual claims filed by policyholders. In general, most claims are relatively minor, cost wise, but a few are quite expensive. Each quarter, you compile a report summarizing key claims statistics that includes the number of claims submitted, the mean cost per claim, the median cost per claim, the proportion of claims being litigated, the number of emergency procedures, the proportion of men versus women, and the average age of claimants. Your measures are computed separately for the southern and northern regions, and you are interested in determining whether or not there are statistically significant differences between the two regions on each of the aforementioned measures.
Evaluate which comparisons would require the use of the t-test and which would use the chi-squared test.
Explain your answers. Support your discussion with relevant examples, research, and rationale.
The mean cost per claim versus the median cost per claim. Suppose, for example, that the mean cost is greater than the median cost because of a few claims that are noticeably expensive. From what we have learned in our previous weekly discussions, the median is my preferred measure, simply because the larger the range of values included in a sample, the greater the risk that extreme values will distort our picture of the population distribution. As Anderson, Sweeney, & Williams (2018) wrote, these values can be called outliers because they are data points that do not fit the trend reflected by the remaining data and must therefore be corrected.
According to Parapar, Losada, Presedo, & Barreiro (2020), the t-test can only be used when there are “data drawn from specific distributions.” Based on their study, the authors concluded that the “T-test assumes that its results follow a normal distribution with an assumption that the mean of the distribution of differences is zero.” In our situation, however, no guarantee was provided that the populations of the southern and northern regions are normally distributed. Therefore, the t-test will not be a reliable measure.
With reference to our textbook by Anderson, Sweeney, & Williams (2018), we can use the gender of claimants (men versus women) as one variable and the age of claimants, grouped into categories, as another variable. Using the chi-square test of independence, we can determine whether these two variables, observed in random samples from the southern and northern region populations, are independent of each other. We take one random sample from the southern region population and record each claimant's value on our two categorical variables, gender and age group. These counts are the observed values. Assuming that the variables are independent, we compute the expected count for each cell from the row and column totals. We subtract each expected value from the observed value, square the difference, divide the result by the expected value, and sum over all cells. If the resulting statistic is too small to reject independence, we can conclude that the proportion of men and women does not depend on the age group of the claimants. We do the same for the northern region population. We can see whether there are more female claimants or male claimants. We can also compare, between the two regions, whether a certain age group files more claims than any other age group without regard to gender. Additionally, we can summarize our data by a combination of these two categorical variables: for example, do we see more female claimants under 50 years old than male claimants of the same age group?
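The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming a made-up 2 x 2 table of region by gender. The counts and the function name chi_square_independence are hypothetical, and expected counts are derived from the row and column totals under the assumption of independence.

```python
def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df


# Hypothetical counts: rows are regions (south, north),
# columns are gender (men, women).
observed = [
    [60, 40],
    [45, 55],
]

chi_sq, df = chi_square_independence(observed)
print(f"chi-square = {chi_sq:.3f}, df = {df}")
```

The statistic is compared with a chi-square critical value for the given degrees of freedom (3.841 at the 0.05 level when df is 1). A value above the critical value suggests that the two variables are not independent.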
One example I can think of is deciding whether a mom's choice of a certain brand of milk at the grocery store is related to the cost of the product (grouped into price ranges). If we can establish that moms buy milk because of a particular brand that they believe in and not because of the cost, then we can conclude that these two variables are independent of each other. For this example, we can therefore use the chi-square test.