# Report for Anu’s restaurant

This report presents the results of analysing a questionnaire conducted for Anu’s high-fashion restaurant. The questions in the assignment are answered in the order in which they were set.

## Question 1: Descriptive analysis for every question

The questionnaire contains 30 questions intended to portray potential customers’ characteristics and preferences. The answers are measured on several different scales: nominal, ordinal, interval and ratio. Accordingly, we use different methods to report the central tendency and variation of each variable.

For nominal variables, we report the mode (the largest group) and its proportion as the central tendency, and the proportion of the second-largest group as an indication of variation, because there is no agreed measure of dispersion for nominal variables.

For ordinal variables, the median measures central tendency. As before, the proportion of the second-largest group represents dispersion when the median category is also the largest group; otherwise we use the proportion of the largest group.

For interval variables, the mean and the standard deviation represent central tendency and dispersion.

The questions can be classified into three groups: personal affairs, characteristics and preferences.
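The rule described above for choosing summary statistics by measurement scale can be sketched in Python. This is only an illustration, not the procedure that produced the tables: the function name `summarize` is ours, and the sketch assumes ordinal answers are stored as ordered codes (for example 1 to 5).

```python
from collections import Counter
from statistics import mean, median_low, stdev

def summarize(values, scale):
    """Pick central tendency and dispersion according to the measurement scale."""
    if scale in ("interval", "ratio"):
        return {"center": mean(values), "spread": stdev(values)}
    n = len(values)
    # Groups sorted from largest to smallest, as (value, count) pairs.
    ranked = Counter(values).most_common()
    largest, largest_share = ranked[0][0], ranked[0][1] / n
    second_share = ranked[1][1] / n if len(ranked) > 1 else 0.0
    if scale == "nominal":
        # Mode as the center; second-largest share as a rough dispersion measure.
        return {"center": largest, "share": largest_share, "spread": second_share}
    if scale == "ordinal":
        # median_low always returns an observed category, never a midpoint.
        med = median_low(values)
        spread = second_share if med == largest else largest_share
        return {"center": med, "spread": spread}
    raise ValueError(f"unknown scale: {scale}")
```

Ties between equally large groups are resolved by order of first appearance, which is an arbitrary choice of this sketch.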

The questions about personal affairs are summarized in Tab 1.

Tab 1: questions about personal affairs

The questions about characteristics are listed in Tab 2.

Tab 2: questions about characteristics

The questions about the preference are listed in Tab 3.

Tab 3: questions about the preference

## Question 2: Some population estimations

In this section, we estimate some population proportions and measures of central tendency from the answers.

1. Preference for “easy listening” radio programming
The sample proportion of this group is 18.8%.
H0: the proportion of “easy listening” in the whole population is 18.8%.
H1: the proportion of “easy listening” in the whole population is not 18.8%.
The binomial test gives a p value of 0.5, so we cannot reject the null hypothesis; the data are consistent with a population proportion of 18.8% for “easy listening”.

Tab 4: Binomial test for “easy listening” radio
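An exact two-sided binomial test of this kind can be sketched with the Python standard library. The counts below (75 of 400 respondents) are hypothetical, since the sample size is not quoted in the text, and the function names are ours. Statistics packages differ in how they define the two-sided p value, so the result need not match Tab 4.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_test_two_sided(k, n, p0):
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    pmf = [binom_pmf(i, n, p0) for i in range(n + 1)]
    # Small relative tolerance guards against floating-point ties.
    cutoff = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= cutoff))

# Hypothetical counts: 75 of 400 respondents (18.75%), tested against p0 = 0.188.
p_value = binom_test_two_sided(75, 400, 0.188)
print(f"p = {p_value:.3f}")
```

When the hypothesized proportion is set equal to the observed sample proportion, as here, the observed count sits at the center of the null distribution, so a large p value is expected by construction.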

2. Viewing the 10 p.m. local news on TV
The sample proportion of viewing the 10 p.m. news is 49%.
H0: the proportion viewing the 10 p.m. news is 49%.
H1: the proportion viewing the 10 p.m. news is not 49%.
The p value of the binomial test is 0.476, so we cannot reject the null hypothesis; the data are consistent with a population proportion of 49% viewing the 10 p.m. news.

Tab 5: Binomial test for local news

3. Average age of the heads of household
The average age of the heads of household in the sample is 45.63.
H0: the mean age of the heads of household in the whole population is 45.63.
H1: the mean age of the heads of household in the whole population is not 45.63.
The p value is 1, which shows that there is not enough evidence to reject the null hypothesis, as Tab 6 reveals. The data are therefore consistent with a population mean age of 45.63 for the heads of household.

Tab 6: One sample test for average age of household.
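A short worked formula shows why the p value for the age test is exactly 1. The one-sample t statistic, in standard notation (the symbols $\bar{x}$, $s$, $n$ and $\mu_0$ are ours and do not come from the output table), is

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
$$

Setting the hypothesized mean equal to the sample mean, $\mu_0 = \bar{x} = 45.63$, gives $t = 0$, and the two-sided p value of $t = 0$ is 1 for any sample size. A non-rejection in this setting is therefore expected by construction.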

4. Average price paid for an evening meal entrée
The average price paid for an evening meal entrée in the sample is 18.85.
H0: the population mean of the expected price for an evening meal entrée is 18.85.
H1: the population mean of the expected price for an evening meal entrée is not 18.85.
Since the p value is 0.994, there is not enough evidence to reject the null hypothesis, so the data are consistent with a population mean expected price of 18.85 for an evening meal entrée.

Tab 7: One sample test for average price for an entree

## Question 3: High income proportion

Since Anu hopes that the proportion of households with income over \$100,000 is larger than 30%, we test this.
H0: the proportion of income over \$100,000 in the whole population is not more than 30%.
H1: the proportion of income over \$100,000 in the whole population is more than 30%.
In the sample, this proportion is 27.5%. The p value of the binomial test is 0.16, which is larger than the usual significance level of 0.05, so there is not enough evidence to reject the null hypothesis. Anu’s hope that the proportion of income over \$100,000 is larger than 30% may not be right.

Tab 8: the income test
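A one-sided exact binomial test for a hypothesis of the form “the proportion is more than 30%” can be sketched as follows. The counts (110 of 400 respondents, that is 27.5%) are hypothetical because the sample size is not quoted, and the function name is ours. A statistics package may report a different tail or use a normal approximation, so the output need not equal 0.16.

```python
from math import comb

def binom_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): exact p-value for H1: p > p0."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Hypothetical counts: 110 of 400 respondents (27.5%) report income over $100,000.
p_value = binom_upper_tail(110, 400, 0.30)
print(f"one-sided p = {p_value:.3f}")
```

In this hypothetical example the sample proportion lies below 30%, so the upper-tail p value is large and the data give no support to the alternative hypothesis.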

## Question 4: Décor and style

Anu believes that the potential patrons are fond of tuxedos, elegance, and jazz. Here we test these guesses.

For tuxedos,
H0: among potential patrons, those who prefer tuxedos account for 50%.
H1: among potential patrons, those who prefer tuxedos do not account for 50%.
The p value of this test is 0, which indicates that there is enough evidence to reject the null hypothesis. Since every respondent in the sample who is “strongly likely to patronize” likes tuxedos, we can infer that this proportion is larger than 50%.

For elegant décor,
H0: among highly potential patrons, those who prefer elegant décor account for 50%.
H1: among highly potential patrons, those who prefer elegant décor do not account for 50%.
The situation is quite similar to that of tuxedos: the p value is 0, and every respondent in the sample likes elegant décor, so we can infer that this proportion is larger than 50%.

For jazz combo,
H0: among highly potential patrons, those who prefer a jazz combo account for 50%.
H1: among highly potential patrons, those who prefer a jazz combo do not account for 50%.
Here the situation is different. The p value is 0, but in the sample the people who like a jazz combo account for only 19%, so we can infer that among highly potential patrons only a minority like a jazz combo.

Tab 9: the binomial test for tuxedos, jazz and elegant décor.

## Question 5: Gender difference

Anu assumes that this high-fashion restaurant is more appealing to women than to men. We therefore form the following hypotheses.
H0: there is no significant difference between women and men in the mean likelihood of patronizing.
H1: there is a significant difference between women and men in the mean likelihood of patronizing.
A two-sample test gives a p value of 0.325, so there is not enough evidence to reject the null hypothesis. This suggests that the likelihood of patronizing may be distributed in the same way for women and men.

Tab 10: The test for gender difference
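A two-sample comparison of means can be sketched as below. This is not necessarily the test behind Tab 10: it uses the Welch statistic (unequal variances) with a normal approximation to the p value, which is reasonable only for fairly large groups, and the ratings are hypothetical 1 to 5 scores invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_z_test(a, b):
    """Welch statistic with a normal approximation to the two-sided p-value."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical likelihood ratings (1-5) for 40 women and 40 men.
women = [3, 4, 2, 5, 3, 4, 3, 2] * 5
men = [3, 3, 2, 4, 3, 5, 2, 2] * 5
t, p = welch_z_test(women, men)
print(f"t = {t:.2f}, p = {p:.3f}")
```

For small groups the p value should come from a t distribution rather than the normal distribution used here.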

## Question 6: Location

The choice between a location with a waterfront view and a location within a 30-minute drive is quite tough.
H0: there is no significant difference between the mean preference for a waterfront view and the mean preference for a drive of less than 30 minutes.
H1: there is a significant difference between the mean preference for a waterfront view and the mean preference for a drive of less than 30 minutes.
We conduct a Wilcoxon signed rank test, and the p value is 0, as shown in Tab 11. This shows that there is enough evidence to reject the null hypothesis. Since the positive rank sum for “prefer waterfront view – prefer drive less than 30 minutes” is larger than the negative rank sum, we conclude that the waterfront view is more welcome.

Tab 11: Wilcoxon signed rank test about the location
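The mechanics of the Wilcoxon signed rank test can be sketched with the standard library. This is a simplified illustration under stated assumptions: zero differences are dropped, tied absolute differences share an average rank, and the p value uses the normal approximation without a tie correction, so it will differ somewhat from packaged output on heavily tied ordinal data. The function name is ours.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, two-sided p) using the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the block of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction, for brevity
    z = (w_plus - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, w_minus, p
```

A positive rank sum much larger than the negative one, as reported for the location question, means that most respondents rated the first option above the second.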

## Question 7: Relation between income and restaurant style

Whether the common belief that a high-fashion restaurant appeals less to low-income people than to high-income people holds for Anu’s restaurant needs a careful check. In Tab 12, we build a cross table of the likelihood to patronize against income.

Tab 12: a cross table for the variable of likelihood to patronize and variable of income.

H0: the variable of income is independent of the variable of likelihood to patronize.
H1: the variable of income is not independent of the variable of likelihood to patronize.
We conduct a chi-square test, and the p value is 0. This shows that there is enough evidence to reject the null hypothesis. From the cross table, we can see that there is a positive relationship between the likelihood of patronizing and income.

Tab 13: chi square test
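The chi-square test of independence on a cross table can be sketched in plain Python. The table below is hypothetical (rows as three income bands, columns as three levels of likelihood to patronize) and is not the data of Tab 12; the function names are ours. The upper-tail probability is computed from a standard recurrence for the incomplete gamma function, valid for integer degrees of freedom of at least 1.

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2
    # Start from Q(1/2, y) or Q(1, y), then apply
    # Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1).
    a, q = (0.5, erfc(sqrt(y))) if df % 2 else (1.0, exp(-y))
    while a < df / 2:
        q += y**a * exp(-y) / gamma(a + 1)
        a += 1
    return q

def chi2_independence(table):
    """Pearson chi-square test on a contingency table given as a list of rows."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = sum((obs - rt * ct / n) ** 2 / (rt * ct / n)
               for r, rt in zip(table, row_tot)
               for obs, ct in zip(r, col_tot))
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df, chi2_sf(stat, df)

# Hypothetical counts: rows = low / middle / high income,
# columns = unlikely / neutral / likely to patronize.
table = [[30, 20, 10],
         [20, 25, 15],
         [10, 20, 50]]
stat, df, p = chi2_independence(table)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.2g}")
```

The test only says whether the two variables are related; the direction of the relationship still has to be read from the cross table itself.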

## Question 8: Age and unusual menu

The difference between younger and older people may be reflected in their choice of menu. We divide age into 6 groups.

For the unusual dessert,
H0: the two variables age and preference for unusual dessert are distributed independently.
H1: the two variables age and preference for unusual dessert are not distributed independently.
The cross table is shown below (Tab 14). The p value is 0, which means that we can reject the null hypothesis. From the cross table, we can see that older people do not like unusual desserts.

Tab 14: Cross table for age and unusual dessert.

For the unusual entrée, we conduct the same analysis.
H0: the two variables age and preference for unusual entrée are distributed independently.
H1: the two variables age and preference for unusual entrée are not distributed independently.
The p value for the test is 0, and the conclusion is that older people do not prefer unusual entrées.

## Question 9: Demographic profile of potential patrons

Among potential patrons, we examine the relationships between four demographic characteristics: income, gender, zip code and education level. This gives 6 cross tables in total, one for each pair of variables, of which only 1 shows a significant relationship.

H0: the two variables income and zip code are distributed independently.
H1: the two variables income and zip code are not distributed independently.
The p value is 0, which indicates that there is enough evidence to reject the null hypothesis. This shows that potential patrons largely live in area B and have a higher income than those in area C.

Tab 15: crosstab of education and zip code

For the media habits of potential patrons, the relationships between TV, radio, newspaper and magazine use may give Anu a clue for advertising. Of the 6 pairwise relationships, however, only 1 is statistically significant.

Tab 16 shows that people who subscribe to the city magazine are likely to watch the 6:00 p.m. news.
H0: the two variables preference for magazine and preference for news are distributed independently.
H1: the two variables preference for magazine and preference for news are not distributed independently.
The p value is 0.024, which indicates that at the 0.05 significance level we can reject the null hypothesis. This gives a clue about how potential patrons’ magazine and newscast habits are related.

Tab 16: crosstab of newscast and magazine.

## Question 10: Regression analysis

Determining the relation between the likelihood of patronizing Anu’s restaurant and the characteristics of the restaurant calls for multiple regression analysis. Besides the 10 ordinal variables that represent patrons’ preferences for the restaurant, the interval variable that represents the expected price of an evening entrée can also be seen as a characteristic of the restaurant. The aim of the multiple regression is to use these 11 variables to formulate an equation that expresses the quantitative relation between a restaurant’s characteristics and the likelihood to patronize. Although nearly all the independent variables are ordinal, and the dependent variable, the likelihood to patronize, is of the same type, the sample size is relatively large, so we fit a linear regression.

These 11 variables differ in their power to explain the dependent variable, so we use a stepwise procedure to select the best ones. In the end, the variables “expected price for evening entree” and “preference for tuxedos” enter our model.

Tab 17: regression variables entered

The final model has an R square of 0.727, which shows that the two independent variables have substantial power to explain the variance of the dependent variable. The variable “expected price of evening entree” alone explains 71% of that variance, since the first model contains only this independent variable and its R square is 0.712. It is therefore the most powerful variable in this regression. Together, “expected price of evening entree” and “preference for tuxedos” explain 73% of the total variance of the dependent variable, and we consider that these two variables best explain it.

Tab 18: R square for regression
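The contribution of the second variable follows directly from the two R square values quoted in the text (the subscripts naming the two stepwise models are ours):

$$
\Delta R^2 = R^2_{\text{model 2}} - R^2_{\text{model 1}} = 0.727 - 0.712 = 0.015
$$

So once the expected price is in the model, “preference for tuxedos” adds about 1.5 percentage points of explained variance.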

The ANOVA provides the details of the variance of the model and the residuals.
H0: there is no significant linear relation between the independent variables and the dependent variable.
H1: there is a significant linear relation between the independent variables and the dependent variable.
Since the p value is 0, we have enough evidence to reject the null hypothesis, and conclude that our linear regression model is statistically significant.

Tab 19: ANOVA for regression

The following table provides the details of the coefficients.
H0: the coefficient for the variable “expected price of evening entree” is zero.
H1: the coefficient for the variable “expected price of evening entree” is significantly different from zero.
The p value is 0, so we have enough evidence to reject the null hypothesis, and conclude that the coefficient for “expected price of evening entree” is 0.072.
The situation is very similar for the variable “preference for tuxedos”.
H0: the coefficient for the variable “preference for tuxedos” is zero.
H1: the coefficient for the variable “preference for tuxedos” is significantly different from zero.
The p value is 0, so we have enough evidence to reject the null hypothesis, and conclude that the coefficient for “preference for tuxedos” is 0.133.

Tab 20: coefficients for regression
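Written out, and assuming that the two quoted values are the unstandardized coefficients, the fitted model has the form below. The intercept is not quoted in the text, so it is left as the symbol $b_0$; the variable names are ours.

$$
\widehat{\text{likelihood}} = b_0 + 0.072 \cdot \text{price} + 0.133 \cdot \text{tuxedo}
$$

Under that assumption, each additional unit of expected entrée price raises the predicted likelihood to patronize by 0.072 points, and each additional point of preference for tuxedos raises it by 0.133 points, holding the other variable fixed.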

Linear regression assumes that the residuals are normally distributed. The following histogram shows the frequency distribution of the residuals; it departs slightly from the normal distribution.

Fig1 the histogram for residual

The QQ plot further confirms that the residuals may follow some other distribution. The discontinuities in the plot may result from the many ordinal variables.

Fig 2 the QQ plot for regression
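The points of a normal QQ plot can be computed with the standard library, as sketched below. The function name and the plotting positions $(i - 0.5)/n$ are our choices (other conventions exist), and the residuals passed in would come from the fitted model, which is not reproduced here.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(residuals):
    """Pair each standardized, sorted residual with its theoretical normal quantile."""
    n = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    observed = sorted((r - m) / s for r in residuals)
    # Plotting positions (i - 0.5) / n, one common convention among several.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))
```

Points close to the line y = x suggest approximately normal residuals. When the dependent variable takes only a few ordinal levels, the residuals cluster and the plot takes on a stepped, discontinuous look.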

The plots above treat the residuals as a whole; now we look at them against the dependent variable. We can see clearly that the residuals are not symmetrically distributed around 0 at each level of the dependent variable, and that the centers at each level have a nonlinear relationship with the dependent variable, which indicates that a nonlinear model may be appropriate.

Fig 3 the residual plot for each level of dependent variable

Despite the above concerns about the residuals, this linear regression model generally captures the main sources of the dependent variable’s variance, namely the two variables “expected price of evening entree” and “preference for tuxedos”. For a more accurate model, nonlinearity is the first thing to consider.
