3.18 Chapter 1
Histogram
I. Key Concepts
1. Population: A population is the group of all items of interest to a statistics practitioner.
2. Sample: A sample is a set of data drawn from the population.
[Part of a population]
3. Parameter: A descriptive measure of a population.
4. Statistic: A descriptive measure of a sample.
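The four definitions above can be illustrated with a minimal sketch. The population here is hypothetical (simulated heights in cm), and the names `population`, `sample`, `population_mean` and `sample_mean` are chosen only for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: a measurement for each of 10,000 items of interest.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: a descriptive measure of the population.
population_mean = statistics.fmean(population)

# Sample: a set of data drawn from the population (here, 100 items without replacement).
sample = random.sample(population, 100)

# Statistic: a descriptive measure of the sample.
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The statistic is usually close to the parameter but not equal to it, which is why inference from a sample carries uncertainty.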
II. Types of Statistics
1. Descriptive Statistics: organizing, summarizing, and presenting data.
Includes: Graphical Techniques & Numerical Techniques
2. Inferential Statistics: drawing conclusions or inferences about characteristics of populations based on data from a sample.
III. Significance Level and Confidence Level
1. The confidence level is the proportion of times that an interval estimate for a population parameter will be correct. We use "1 - α" to represent the confidence level when we wish to estimate a population parameter.
2. The significance level measures how frequently a "true claim" is accidentally rejected. We use α (Greek letter "alpha") to represent the significance level when testing a claim about a population parameter.
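The "proportion of times an interval estimate is correct" reading can be checked with a small simulation. This is a sketch under stated assumptions: a normal population with known standard deviation, a z-interval for the mean, and hypothetical values for `mu`, `sigma`, `n` and `trials`.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is reproducible

alpha = 0.05                              # significance level
confidence_level = 1 - alpha              # confidence level
z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value, about 1.96

mu, sigma, n = 50.0, 8.0, 30              # hypothetical population mean, std dev, sample size
trials = 2000
half_width = z * sigma / n ** 0.5         # known-sigma interval: x_bar +/- z * sigma / sqrt(n)

covered = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # Count the interval as "correct" when it contains the true mean.
    if x_bar - half_width <= mu <= x_bar + half_width:
        covered += 1

coverage = covered / trials
print(f"confidence level: {confidence_level:.2f}, observed coverage: {coverage:.3f}")
```

The fraction of intervals that miss `mu` is roughly α, which matches the rate at which a two-sided z-test at significance level α would reject the true claim that the mean equals `mu`.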
https://www.zhihu.com/question/23149768?utm_campaign=rss&utm_medium=rss&utm_source=rss&utm_content=title
Chapter 2 Numerical Descriptive Techniques
Part 1
A variable [typically called a "random" variable, since we do not know its value until we observe it] is some characteristic of a population or sample.
The values of the variable are the range of possible values for a variable.
I. Two Types of Data
1. Numerical/Quantitative Data [real numbers]: includes interval data and ratio data.
a. Continuous Data: data can be any real number within a given range.
b. Discrete Data: data can take only specific values that we can list.
2. Qualitative/Categorical Data [labels rather than numbers]: includes nominal data and ordinal data.
a. Frequency distribution: the data in a table that presents the categories and their counts.
b. Relative frequency distribution: lists the categories and the proportion with which each occurs.
c. Pareto chart: since nominal data has no order, we are free to arrange the outcomes from the most frequently occurring to the least frequently occurring; a bar chart drawn in that order is a Pareto chart.
d. Cross-sectional data: observations measured at the same point in time.
e. Time-series data: observations measured at successive points in time.
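The frequency distribution, relative frequency distribution and Pareto ordering from items a to c can be sketched as follows. The survey responses are hypothetical data made up for this example.

```python
from collections import Counter

# Hypothetical nominal data: preferred drink of 10 survey respondents.
responses = ["tea", "coffee", "tea", "juice", "coffee",
             "tea", "water", "coffee", "tea", "juice"]

# Frequency distribution: categories and their counts.
frequency = Counter(responses)

# Relative frequency distribution: categories and the proportion of each.
n = len(responses)
relative_frequency = {category: count / n for category, count in frequency.items()}

# Pareto ordering: most frequent category first, with the cumulative proportion.
pareto = []
cumulative = 0.0
for category, count in frequency.most_common():
    cumulative += count / n
    pareto.append((category, count, round(cumulative, 2)))

for row in pareto:
    print(row)
```

Plotting the `pareto` rows as bars in this order, with the cumulative proportion as a line, would give the usual Pareto chart; the plotting step is left out to keep the sketch dependency-free.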
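The two main objects of study in econometrics are cross-sectional data and time-series data. The former aims to establish whether different economic agents show similar behavioral relationships, with the correlation revealed by the estimated model parameters; the latter focuses on data for the same economic agent at different times, in order to show the dynamic behavior of the subject under study.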
Line Chart
Part 2
Numerical Descriptive Techniques...
1. Measures of Central Location: Mean, Median, Mode
2. Measures of Variability: Range, Standard Deviation, Variance, Coefficient of Variation
3. Measures of Relative Standing: Percentiles, Quartiles
4. Measures of Linear Relationship: Covariance, Coefficient of Correlation, Least Squares Line
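The four groups of measures listed above can be computed with Python's standard library. This is a minimal sketch on hypothetical data (hours studied `x` and exam score `y` for 8 students), using the sample versions of variance and covariance (dividing by n - 1).

```python
import statistics

# Hypothetical data: hours studied (x) and exam score (y) for 8 students.
x = [2, 3, 3, 5, 6, 7, 8, 10]
y = [52, 55, 60, 64, 70, 71, 80, 88]
n = len(x)

# 1. Measures of central location
mean_x = statistics.mean(x)
median_x = statistics.median(x)
mode_x = statistics.mode(x)

# 2. Measures of variability (sample versions, dividing by n - 1)
range_x = max(x) - min(x)
variance_x = statistics.variance(x)
std_x = statistics.stdev(x)
cv_x = std_x / mean_x                      # coefficient of variation

# 3. Measures of relative standing
q1, q2, q3 = statistics.quantiles(x, n=4)  # quartiles; q2 is the median

# 4. Measures of linear relationship
mean_y = statistics.mean(y)
std_y = statistics.stdev(y)
covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
correlation = covariance / (std_x * std_y)
slope = covariance / variance_x            # least squares line: y = intercept + slope * x
intercept = mean_y - slope * mean_x

print(f"mean={mean_x}, median={median_x}, mode={mode_x}, range={range_x}")
print(f"variance={variance_x:.2f}, std={std_x:.2f}, cv={cv_x:.2f}")
print(f"quartiles=({q1}, {q2}, {q3})")
print(f"cov={covariance:.2f}, r={correlation:.3f}, line: y = {intercept:.2f} + {slope:.2f}x")
```

Quartile conventions differ between textbooks and software; `statistics.quantiles` uses its "exclusive" method by default, so Q1 and Q3 may differ slightly from hand calculations that use another rule.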
Arithmetic mean: the sum of the observations divided by the number of observations.
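The Empirical Rule: The empirical rule is a statistical rule stating that, for a normal distribution, almost all data fall within three standard deviations of the mean. Specifically, about 68% of the data fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.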
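The empirical-rule percentages can be checked against the normal distribution itself. The sketch below uses the standard normal distribution; because the rule is stated in units of standard deviations, the same proportions hold for any mean and standard deviation.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of a normal distribution within k standard deviations of the mean.
within = {}
for k in (1, 2, 3):
    within[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} standard deviation(s): {within[k]:.4f}")
```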
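Box Plots...: The box plot was invented in 1977 by the American statistician John Tukey. It is built from five values: the minimum (min), the lower quartile (Q1), the median, the upper quartile (Q3), and the maximum (max).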
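The five values behind a box plot can be computed as below. This is a sketch on hypothetical observations; the choice of `method="inclusive"` for the quartiles is an assumption, since several quartile conventions are in use and they can give slightly different Q1 and Q3.

```python
import statistics

# Hypothetical observations.
data = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]

# Quartiles: Q1, median (Q2), Q3.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Five-number summary used to draw a box plot.
five_number_summary = (min(data), q1, median, q3, max(data))
print(five_number_summary)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.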
http://www.blogjava.net/norvid/articles/317235.html