radar_sun

15-R-DataCamp-Introduction-to-Data-in-R

1. Languages of Data

1.1 Welcome to the course (video)

1.2 Loading data into R

Instruction 1:

# Load data
data(email50)

# View the structure of the data
str(email50)

1.3 Types of variables (video)

1.4 Identify variable types

Instruction:

# Glimpse email50
glimpse(email50)

1.5 Categorical data in R: factors (video)

1.6 Filtering based on a factor

Instruction:

# Subset of emails with big numbers: email50_big
email50_big <- email50 %>%
  filter(number == "big")

# Glimpse the subset
glimpse(email50_big)

1.7 Complete filtering based on a factor

Instruction:

# Subset of emails with big numbers: email50_big
email50_big <- email50 %>%
  filter(number == "big")

# Table of the number variable
table(email50_big$number)

# Drop levels
email50_big$number_dropped <- droplevels(email50_big$number)

# Table of the number_dropped variable
table(email50_big$number_dropped)

1.8 Discretize a variable (video)

1.9 Discretize a different variable

Instruction:

# Calculate median number of characters: med_num_char
med_num_char <- median(email50$num_char)

# Create num_char_cat variable in email50
email50_fortified <- email50 %>%
  mutate(num_char_cat = ifelse(num_char < med_num_char, "below median", "at or above median"))
  
# Count emails in each category
email50_fortified %>%
  count(num_char_cat)

1.10 Combining levels of a different factor

Instruction:

# Create number_yn column in email50
email50_fortified <- email50 %>%
  mutate(
    number_yn = case_when(
      # if number is "none", make number_yn "no"
      number == "none" ~ "no", 
      # if number is not "none", make number_yn "yes"
      number != "none" ~ "yes"  
    )
  )
  
# Visualize the distribution of number_yn
ggplot(email50_fortified, aes(x = number_yn)) +
  geom_bar()

1.11 Visualizing numerical data (video)

1.12 Visualizing numerical and categorical data

Instruction:

# Load ggplot2
library(ggplot2)

# Scatterplot of exclaim_mess vs. num_char
ggplot(email50, aes(x = num_char, y = exclaim_mess, color = factor(spam))) +
  geom_point()

2. Study Types and Cautionary Tales

2.1 Observational studies and experiments (video)

2.2 Identify type of study: Reading speed and font

2.3 Identify type of study: Countries

Instruction:

# Load data
data(gapminder)

# Glimpse data
glimpse(gapminder)

# Identify type of study: observational or experimental
type_of_study <- "observational"

2.4 Random sampling and random assignment (video)

2.5 Random sampling or random assignment?

2.6 Identify the scope of inference of study

2.7 Simpsons paradox (video)

2.8 Number of males and females admitted

Instruction:

# Load packages
library(dplyr)

# Count number of male and female applicants admitted
ucb_admit %>%
  count(Gender, Admit)

2.9 Proportion of males admitted overall

Instruction:

ucb_admission_counts %>%
  # Group by gender
  group_by(Gender) %>%
  # Create new variable
  mutate(prop = n / sum(n)) %>%
  # Filter for admitted
  filter(Admit == "Admitted")

2.10 Proportion of males admitted for each department

Instruction 1:

ucb_admission_counts <- ucb_admit %>%
  # Counts by department, then gender, then admission status
  count(Dept, Gender, Admit)

# See the result
ucb_admission_counts

Instruction 2:

ucb_admission_counts  %>%
  # Group by department, then gender
  group_by(Dept, Gender) %>%
  # Create new variable
  mutate(prop = n / sum(n)) %>%
  # Filter for male and admitted
  filter(Gender == "Male", Admit == "Admitted")

2.11 Admission rates for males across departments

2.12 Recap: Simpson’s paradox (video)

Identify type of study: Countries [new]

3. Sampling Strategies and Experimental Design

3.1 Sampling strategies (video)

3.2 Sampling strategies, determine which

3.3 Sampling strategies, choose worst

3.4 Sampling in R (video)

3.5 Simple random sample in R

Instruction:

# Simple random sample: states_srs
states_srs <- us_regions %>%
  sample_n(size = 8)

# Count states by region
states_srs %>%
  count(region)

3.6 Stratified sample in R

Instruction:

# Stratified sample
states_str <- us_regions %>%
  group_by(region) %>%
  sample_n(2)

# Count states by region
states_str %>%
  count(region)

3.7 Compare SRS vs stratified sample

3.8 Principles of experimental design (video)

3.9 Identifying components of a study

3.10 Experimental design terminology

3.11 Connect blocking and stratifying

4. Case study

4.1 Beauty in the classroom (video)

4.2 Inspect the data

Instruction:

# Inspect evals
glimpse(evals)

4.3 Identify type of study

4.4 Sampling/ experimental attributes

4.5 Variables in the data (video)

4.6 Identify variable types

Instruction:

# Inspect variable types
glimpse(evals)

# Remove non-factor variables from the vector below
cat_vars <- c("rank", "ethnicity", "gender", "language", "cls_level", "cls_profs", "cls_credits","pic_outfit", "pic_color")

4.7 Recode a variable

Instruction:

# Recode cls_students as cls_type
evals_fortified <- evals %>%
  mutate(
    cls_type = case_when(
      cls_students <=18  ~ "small",
      cls_students <=59 ~ "midsize",
      cls_students >=60 ~ "large"
    )
  )

4.8 Create a scatterplot

Instruction:

# Scatterplot of score vs. bty_avg
ggplot(evals, aes(x = bty_avg, y = score)) +
  geom_point()

4.9 Create a scatterplot, with an added layer

Instruction:

# Scatterplot of score vs. bty_avg colored by cls_type
ggplot(evals, aes(x = bty_avg, y = score, color = cls_type)) +
  geom_point()

4.10 Congratulations

15-R-DataCamp-Introduction-to-Data-in-R radar_sun
15-R-DataCamp-Introduction-to-Data-in-R1.LanguagesofData1.1Welcometothecourse(video)1.2LoadingdataintoRInstruction1:#Loaddatadata(email50)#Viewthestructureofthedatastr(email50)1.3Typesofvariables(vide
[黑洞与暗粒子]没有光的世界 comsci
无论是相对论还是其它现代物理学,都显然有个缺陷,那就是必须有光才能够计算但是,我相信,在我们的世界和宇宙平面中,肯定存在没有光的世界.... 那么,在没有光的世界,光子和其它粒子的规律无法被应用和考察,那么以光速为核心的 &nbs
jQuery Lazy Load 图片延迟加载 aijuans jquery
基于 jQuery 的图片延迟加载插件，在用户滚动页面到图片之后才进行加载。对于有较多的图片的网页，使用图片延迟加载，能有效的提高页面加载速度。版本： jQuery v1.4.4+ jQuery Lazy Load v1.7.2 注意事项：需要真正实现图片延迟加载，必须将真实图片地址写在 data-original 属性中。若 src
使用Jodd的优点 Kai_Ge jodd
1. 简化和统一 controller ，抛弃 extends SimpleFormController ，统一使用 implements Controller 的方式。 2. 简化 JSP 页面的 bind, 不需要一个字段一个字段的绑定。 3. 对 bean 没有任何要求，可以使用任意的 bean 做为 formBean。使用方法简介
jpa Query转hibernate Query 120153216 Hibernate
public List<Map> getMapList(String hql, Map map) { org.hibernate.Query jpaQuery = entityManager.createQuery(hql); if (null != map) { for (String parameter : map.keySet()) { jp
Django_Python3添加MySQL/MariaDB支持 2002wmj mariaDB
现状首先，[email protected] 中默认的引擎为 django.db.backends.mysql 。但是在Python3中如果这样写的话，会发现 django.db.backends.mysql 依赖 MySQLdb[5] ，而 MySQLdb 又不兼容 Python3 于是要找一种新的方式来继续使用MySQL。 MySQL官方的方案首先据MySQL文档[3]说，自从MySQL
在SQLSERVER中查找消耗IO最多的SQL 357029540 SQL Server
返回做IO数目最多的50条语句以及它们的执行计划。 select top 50 (total_logical_reads/execution_count) as avg_logical_reads, (total_logical_writes/execution_count) as avg_logical_writes, (tot
spring UnChecked 异常官方定义！ 7454103 spring
如果你接触过spring的事物管理！那么你必须明白 spring的非捕获异常！即 unchecked 异常！因为 spring 默认这类异常事物自动回滚！！ public static boolean isCheckedException(Throwable ex) { return !(ex instanceof RuntimeExcep
mongoDB 入门指南、示例 adminjun java mongodb 操作
一、准备工作 1、下载mongoDB 下载地址：http://www.mongodb.org/downloads 选择合适你的版本相关文档：http://www.mongodb.org/display/DOCS/Tutorial 2、安装mongoDB A、不解压模式：将下载下来的mongoDB-xxx.zip打开，找到bin目录，运行mongod.exe就可以启动服务，默
CUDA 5 Release Candidate Now Available aijuans CUDA
The CUDA 5 Release Candidate is now available at http://developer.nvidia.com/<wbr></wbr>cuda/cuda-pre-production. Now applicable to a broader set of algorithms, CUDA 5 has advanced fe
Essential Studio for WinRT网格控件测评 Axiba JavaScript html5
Essential Studio for WinRT界面控件包含了商业平板应用程序开发中所需的所有控件，如市场上运行速度最快的grid 和chart、地图、RDL报表查看器、丰富的文本查看器及图表等等。同时，该控件还包含了一组独特的库，用于从WinRT应用程序中生成Excel、Word以及PDF格式的文件。此文将对其另外一个强大的控件——网格控件进行专门的测评详述。网格控件功能 1、
java 获取windows系统安装的证书或证书链 bewithme windows
有时需要获取windows系统安装的证书或证书链，比如说你要通过证书来创建java的密钥库。有关证书链的解释可以查看此处。 public static void main(String[] args) { SunMSCAPI providerMSCAPI = new SunMSCAPI(); S
NoSQL数据库之Redis数据库管理(set类型和zset类型) bijian1013 redis 数据库 NoSQL
4.sets类型 Set是集合，它是string类型的无序集合。set是通过hash table实现的，添加、删除和查找的复杂度都是O(1)。对集合我们可以取并集、交集、差集。通过这些操作我们可以实现sns中的好友推荐和blog的tag功能。 sadd：向名称为key的set中添加元
异常捕获何时用Exception，何时用Throwable bingyingao
用Exception的情况 try { //可能发生空指针、数组溢出等异常 } catch (Exception e) {
【Kafka四】Kakfa伪分布式安装 bit1129 kafka
在http://bit1129.iteye.com/blog/2174791一文中，实现了单Kafka服务器的安装，在Kafka中，每个Kafka服务器称为一个broker。本文简单介绍下，在单机环境下Kafka的伪分布式安装和测试验证 1. 安装步骤 Kafka伪分布式安装的思路跟Zookeeper的伪分布式安装思路完全一样，不过比Zookeeper稍微简单些(不
Project Euler bookjovi haskell
Project Euler是个数学问题求解网站，网站设计的很有意思，有很多problem，在未提交正确答案前不能查看problem的overview，也不能查看关于problem的discussion thread，只能看到现在problem已经被多少人解决了，人数越多往往代表问题越容易。看看problem 1吧： Add all the natural num
Java-Collections Framework学习与总结-ArrayDeque BrokenDreams Collections
表、栈和队列是三种基本的数据结构，前面总结的ArrayList和LinkedList可以作为任意一种数据结构来使用，当然由于实现方式的不同，操作的效率也会不同。这篇要看一下java.util.ArrayDeque。从命名上看
读《研磨设计模式》-代码笔记-装饰模式-Decorator bylijinnan java 设计模式
声明：本文只为方便我个人查阅和理解，详细的分析以及源代码请移步原作者的博客http://chjavach.iteye.com/ import java.io.BufferedOutputStream; import java.io.DataOutputStream; import java.io.FileOutputStream; import java.io.Fi
Maven学习(一) chenyu19891124 Maven私服
学习一门技术和工具总得花费一段时间，5月底6月初自己学习了一些工具，maven+Hudson+nexus的搭建，对于maven以前只是听说，顺便再自己的电脑上搭建了一个maven环境，但是完全不了解maven这一强大的构建工具，还有ant也是一个构建工具，但ant就没有maven那么的简单方便，其实简单点说maven是一个运用命令行就能完成构建，测试，打包，发布一系列功
[原创]JWFD工作流引擎设计----节点匹配搜索算法(用于初步解决条件异步汇聚问题) 补充 comsci 算法工作 PHP 搜索引擎嵌入式
本文主要介绍在JWFD工作流引擎设计中遇到的一个实际问题的解决方案，请参考我的博文"带条件选择的并行汇聚路由问题"中图例A2描述的情况(http://comsci.iteye.com/blog/339756),我现在把我对图例A2的一个解决方案公布出来，请大家多指点节点匹配搜索算法(用于解决标准对称流程图条件汇聚点运行控制参数的算法) 需要解决的问题：已知分支
Linux中用shell获取昨天、明天或多天前的日期 daizj linux shell 上几年昨天获取上几个月
在Linux中可以通过date命令获取昨天、明天、上个月、下个月、上一年和下一年 # 获取昨天 date -d 'yesterday' # 或 date -d 'last day' # 获取明天 date -d 'tomorrow' # 或 date -d 'next day' # 获取上个月 date -d 'last month' #
我所理解的云计算 dongwei_6688 云计算
在刚开始接触到一个概念时，人们往往都会去探寻这个概念的含义，以达到对其有一个感性的认知，在Wikipedia上关于“云计算”是这么定义的，它说： Cloud computing is a phrase used to describe a variety of computing co
YII CMenu配置 dcj3sjt126com yii
Adding id and class names to CMenu We use the id and htmlOptions to accomplish this. Watch. //in your view $this->widget('zii.widgets.CMenu', array( 'id'=>'myMenu', 'items'=>$this-&g
设计模式之静态代理与动态代理 come_for_dream 设计模式
静态代理与动态代理代理模式是java开发中用到的相对比较多的设计模式，其中的思想就是主业务和相关业务分离。所谓的代理设计就是指由一个代理主题来操作真实主题，真实主题执行具体的业务操作，而代理主题负责其他相关业务的处理。比如我们在进行删除操作的时候需要检验一下用户是否登陆，我们可以删除看成主业务，而把检验用户是否登陆看成其相关业务
【转】理解Javascript 系列 gcc2ge JavaScript
理解Javascript_13_执行模型详解摘要: 在《理解Javascript_12_执行模型浅析》一文中,我们初步的了解了执行上下文与作用域的概念，那么这一篇将深入分析执行上下文的构建过程，了解执行上下文、函数对象、作用域三者之间的关系。函数执行环境简单的代码:当调用say方法时，第一步是创建其执行环境，在创建执行环境的过程中，会按照定义的先后顺序完成一系列操作:1.首先会创建一个
Subsets II hcx2013 set
Given a collection of integers that might contain duplicates, nums, return all possible subsets. Note: Elements in a subset must be in non-descending order. The solution set must not conta
Spring4.1新特性——Spring缓存框架增强 jinnianshilongnian spring4
目录 Spring4.1新特性——综述 Spring4.1新特性——Spring核心部分及其他 Spring4.1新特性——Spring缓存框架增强 Spring4.1新特性——异步调用和事件机制的异常处理 Spring4.1新特性——数据库集成测试脚本初始化 Spring4.1新特性——Spring MVC增强 Spring4.1新特性——页面自动化测试框架Spring MVC T
shell嵌套expect执行命令 liyonghui160com
一直都想把expect的操作写到bash脚本里,这样就不用我再写两个脚本来执行了,搞了一下午终于有点小成就,给大家看看吧. 系统:centos 5.x 1.先安装expect yum -y install expect 2.脚本内容: cat auto_svn.sh #!/bin/bash
Linux实用命令整理 pda158 linux
0. 基本命令　　linux 基本命令整理　　1. 压缩解压　　tar -zcvf a.tar.gz a #把a压缩成a.tar.gz 　　tar -zxvf a.tar.gz #把a.tar.gz解压成a 　　2. vim小结　　2.1 vim替换　　:m,ns/word_1/word_2/gc
独立开发人员通向成功的29个小贴士 shoothao 独立开发
概述：本文收集了关于独立开发人员通向成功需要注意的一些东西,对于具体的每个贴士的注解有兴趣的朋友可以查看下面标注的原文地址。明白你从事独立开发的原因和目的。保持坚持制定计划的好习惯。万事开头难，第一份订单是关键。培养多元化业务技能。提供卓越的服务和品质。谨小慎微。营销是必备技能。学会组织，有条理的工作才是最有效率的。 “独立
JAVA中堆栈和内存分配原理 uule java
1、栈、堆 1.寄存器：最快的存储区, 由编译器根据需求进行分配,我们在程序中无法控制.2. 栈：存放基本类型的变量数据和对象的引用，但对象本身不存放在栈中，而是存放在堆（new 出来的对象）或者常量池中（字符串常量对象存放在常量池中。）3. 堆：存放所有new出来的对象。4. 静态域：存放静态成员（static定义的）5. 常量池：存放字符串常量和基本类型常量（public static f

15-R-DataCamp-Introduction-to-Data-in-R

15-R-DataCamp-Introduction-to-Data-in-R

1. Languages of Data

1.1 Welcome to the course (video)

1.2 Loading data into R

1.3 Types of variables (video)

1.4 Identify variable types

1.5 Categorical data in R: factors (video)

1.6 Filtering based on a factor

1.7 Complete filtering based on a factor

1.8 Discretize a variable (video)

1.9 Discretize a different variable

1.10 Combining levels of a different factor

1.11 Visualizing numerical data (video)

1.12 Visualizing numerical and categorical data

2. Study Types and Cautionary Tales

2.1 Observational studies and experiments (video)

2.2 Identify type of study: Reading speed and font

2.3 Identify type of study: Countries

2.4 Random sampling and random assignment (video)

2.5 Random sampling or random assignment?

2.6 Identify the scope of inference of study

2.7 Simpsons paradox (video)

2.8 Number of males and females admitted

2.9 Proportion of males admitted overall

2.10 Proportion of males admitted for each department

2.11 Admission rates for males across departments

2.12 Recap: Simpson’s paradox (video)

Identify type of study: Countries [new]

3. Sampling Strategies and Experimental Design

3.1 Sampling strategies (video)

3.2 Sampling strategies, determine which

3.3 Sampling strategies, choose worst

3.4 Sampling in R (video)

3.5 Simple random sample in R

3.6 Stratified sample in R

3.7 Compare SRS vs stratified sample

3.8 Principles of experimental design (video)

3.9 Identifying components of a study

3.10 Experimental design terminology

3.11 Connect blocking and stratifying

4. Case study

4.1 Beauty in the classroom (video)

4.2 Inspect the data

4.3 Identify type of study

4.4 Sampling/ experimental attributes

4.5 Variables in the data (video)

4.6 Identify variable types

4.7 Recode a variable

4.8 Create a scatterplot

4.9 Create a scatterplot, with an added layer

4.10 Congratulations

你可能感兴趣的:(15-R-DataCamp-Introduction-to-Data-in-R)