infercnv run test 8: ref_group_names=NULL

About 5,000 cells (the log reports 5,312)
Run time: about 1.45 h

> rm(list=ls())
> options(stringsAsFactors = F)
> library(Seurat)
> library(ggplot2)
> library(infercnv)
> expFile='expFile.txt'
> groupFiles='groupFiles.txt'
> geneFile='geneFile.txt'
> infercnv_obj = CreateInfercnvObject(raw_counts_matrix=expFile,
+                                     annotations_file=groupFiles,
+                                     delim="\t",
+                                     gene_order_file= geneFile,
+                                     ref_group_names=c('ref-fib'))  ## depends on the group names in your own annotations file
INFO [2021-03-11 18:09:49] Parsing matrix: expFile.txt

> infercnv_obj = CreateInfercnvObject(raw_counts_matrix=expFile,
+                                     annotations_file=groupFiles,
+                                     delim="\t",
+                                     gene_order_file= geneFile,
+                                     ref_group_names=NULL)  ## depends on the group names in your own annotations file; NULL means no reference group
INFO [2021-03-11 18:13:59] Parsing matrix: expFile.txt
INFO [2021-03-11 18:14:37] Parsing gene order file: geneFile.txt
INFO [2021-03-11 18:14:38] Parsing cell annotations file: groupFiles.txt
INFO [2021-03-11 18:14:38] ::order_reduce:Start.
INFO [2021-03-11 18:14:38] .order_reduce(): expr and order match.
INFO [2021-03-11 18:14:38] ::process_data:order_reduce:Reduction from positional data, new dimensions (r,c) = 17140,5312 Total=63742646 Min=0 Max=2245.
INFO [2021-03-11 18:14:38] num genes removed taking into account provided gene ordering list: 536 = 3.12718786464411% removed.
INFO [2021-03-11 18:14:38] -filtering out cells < 100 or > Inf, removing 0 % of cells
INFO [2021-03-11 18:14:40] validating infercnv_obj
> 
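
The comment on ref_group_names says the value depends on your own grouping information, and the log further down only ever mentions one group, epi. Below is a minimal sketch of how the group labels could be checked before choosing a reference. The toy file, the cell names and the fallback to NULL are assumptions made for illustration; the two-column, headerless, tab-separated layout is what annotations_file is assumed to use here.

```r
# Hypothetical toy annotations in a two-column, tab-separated, headerless
# layout (cell name, group label), standing in for groupFiles.txt.
ann_path <- tempfile(fileext = ".txt")
writeLines(c("cell_1\tepi", "cell_2\tepi", "cell_3\tepi"), ann_path)

ann <- read.table(ann_path, sep = "\t", header = FALSE,
                  col.names = c("cell", "group"),
                  stringsAsFactors = FALSE)

# Count cells per group so the available labels are visible.
group_sizes <- table(ann$group)
print(group_sizes)

# Keep the requested reference labels only if every one of them exists;
# otherwise fall back to NULL (no reference group).
wanted_ref  <- c("ref-fib")
missing_ref <- setdiff(wanted_ref, names(group_sizes))
ref_group_names <- if (length(missing_ref) == 0) wanted_ref else NULL
```

With only epi cells in the annotations, the requested label is absent and the sketch falls back to NULL, which is the setting this test run uses.
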
> ##  Code from the paper; started at 14:58
> start_time <- Sys.time()
> infercnv_obj2 = infercnv::run(infercnv_obj,
+                               cutoff=0.1,  # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
+                               out_dir='plot_out2/' , 
+                               cluster_by_groups=T,   # cluster cells within each annotation group
+                               hclust_method="ward.D2", 
+                               plot_steps=F,
+                               denoise=T,
+                               HMM=T) 
INFO [2021-03-11 18:14:40] ::process_data:Start
INFO [2021-03-11 18:14:40] Creating output path plot_out2/
INFO [2021-03-11 18:14:40] Checking for saved results.
INFO [2021-03-11 18:14:40] 
STEP 1: incoming data

INFO [2021-03-11 18:14:58]

STEP 02: Removing lowly expressed genes

INFO [2021-03-11 18:14:58] ::above_min_mean_expr_cutoff:Start
INFO [2021-03-11 18:14:58] Removing 10290 genes from matrix as below mean expr threshold: 0.1
INFO [2021-03-11 18:14:58] validating infercnv_obj
INFO [2021-03-11 18:14:58] There are 6314 genes and 5312 cells remaining in the expr matrix.
INFO [2021-03-11 18:15:00] no genes removed due to min cells/gene filter
INFO [2021-03-11 18:15:12]

STEP 03: normalization by sequencing depth

INFO [2021-03-11 18:15:12] normalizing counts matrix by depth
INFO [2021-03-11 18:15:16] Computed total sum normalization factor as median libsize: 9562.000000
INFO [2021-03-11 18:15:16] Adding h-spike
INFO [2021-03-11 18:15:16] -no normals defined, using all observation cells as proxy
INFO [2021-03-11 18:15:16] -hspike modeling of normalsToUse
INFO [2021-03-11 18:16:35] validating infercnv_obj
INFO [2021-03-11 18:16:35] normalizing counts matrix by depth
INFO [2021-03-11 18:16:35] Using specified normalization factor: 9562.000000
INFO [2021-03-11 18:16:48]

STEP 04: log transformation of data

INFO [2021-03-11 18:16:48] transforming log2xplus1()
INFO [2021-03-11 18:16:50] -mirroring for hspike
INFO [2021-03-11 18:16:50] transforming log2xplus1()
INFO [2021-03-11 18:17:03]

STEP 08: removing average of reference data (before smoothing)

INFO [2021-03-11 18:17:03] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2021-03-11 18:17:03] -no reference cells specified... using mean of all cells as proxy
INFO [2021-03-11 18:17:08] -subtracting expr per gene, use_bounds=TRUE
INFO [2021-03-11 18:17:11] -mirroring for hspike
INFO [2021-03-11 18:17:11] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2021-03-11 18:17:11] subtracting mean(normal) per gene per cell across all data
INFO [2021-03-11 18:17:13] -subtracting expr per gene, use_bounds=TRUE
INFO [2021-03-11 18:17:33]

STEP 09: apply max centered expression threshold: 3

INFO [2021-03-11 18:17:33] ::process_data:setting max centered expr, threshold set to: +/-: 3
INFO [2021-03-11 18:17:33] -mirroring for hspike
INFO [2021-03-11 18:17:33] ::process_data:setting max centered expr, threshold set to: +/-: 3
INFO [2021-03-11 18:17:53]

STEP 10: Smoothing data per cell by chromosome

INFO [2021-03-11 18:17:53] smooth_by_chromosome: chr: chr1
INFO [2021-03-11 18:17:59] smooth_by_chromosome: chr: chr10
INFO [2021-03-11 18:18:04] smooth_by_chromosome: chr: chr11
INFO [2021-03-11 18:18:10] smooth_by_chromosome: chr: chr12
INFO [2021-03-11 18:18:15] smooth_by_chromosome: chr: chr13
INFO [2021-03-11 18:18:20] smooth_by_chromosome: chr: chr14
INFO [2021-03-11 18:18:25] smooth_by_chromosome: chr: chr15
INFO [2021-03-11 18:18:30] smooth_by_chromosome: chr: chr16
INFO [2021-03-11 18:18:36] smooth_by_chromosome: chr: chr17
INFO [2021-03-11 18:18:41] smooth_by_chromosome: chr: chr18
INFO [2021-03-11 18:18:45] smooth_by_chromosome: chr: chr19
INFO [2021-03-11 18:18:51] smooth_by_chromosome: chr: chr2
INFO [2021-03-11 18:18:56] smooth_by_chromosome: chr: chr20
INFO [2021-03-11 18:19:02] smooth_by_chromosome: chr: chr21
INFO [2021-03-11 18:19:04] smooth_by_chromosome: chr: chr22
INFO [2021-03-11 18:19:09] smooth_by_chromosome: chr: chr3
INFO [2021-03-11 18:19:14] smooth_by_chromosome: chr: chr4
INFO [2021-03-11 18:19:20] smooth_by_chromosome: chr: chr5
INFO [2021-03-11 18:19:26] smooth_by_chromosome: chr: chr6
INFO [2021-03-11 18:19:31] smooth_by_chromosome: chr: chr7
INFO [2021-03-11 18:19:36] smooth_by_chromosome: chr: chr8
INFO [2021-03-11 18:19:41] smooth_by_chromosome: chr: chr9
INFO [2021-03-11 18:19:46] -mirroring for hspike
INFO [2021-03-11 18:19:46] smooth_by_chromosome: chr: chrA
INFO [2021-03-11 18:19:47] smooth_by_chromosome: chr: chr_0
INFO [2021-03-11 18:19:47] smooth_by_chromosome: chr: chr_B
INFO [2021-03-11 18:19:47] smooth_by_chromosome: chr: chr_0pt5
INFO [2021-03-11 18:19:47] smooth_by_chromosome: chr: chr_C
INFO [2021-03-11 18:19:48] smooth_by_chromosome: chr: chr_1pt5
INFO [2021-03-11 18:19:48] smooth_by_chromosome: chr: chr_D
INFO [2021-03-11 18:19:48] smooth_by_chromosome: chr: chr_2pt0
INFO [2021-03-11 18:19:48] smooth_by_chromosome: chr: chr_E
INFO [2021-03-11 18:19:48] smooth_by_chromosome: chr: chr_3pt0
INFO [2021-03-11 18:19:49] smooth_by_chromosome: chr: chr_F
INFO [2021-03-11 18:20:08]

STEP 11: re-centering data across chromosome after smoothing

INFO [2021-03-11 18:20:08] ::center_smooth across chromosomes per cell
INFO [2021-03-11 18:20:14] -mirroring for hspike
INFO [2021-03-11 18:20:14] ::center_smooth across chromosomes per cell
INFO [2021-03-11 18:20:34]

STEP 12: removing average of reference data (after smoothing)

INFO [2021-03-11 18:20:34] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2021-03-11 18:20:34] -no reference cells specified... using mean of all cells as proxy
INFO [2021-03-11 18:20:40] -subtracting expr per gene, use_bounds=TRUE
INFO [2021-03-11 18:20:43] -mirroring for hspike
INFO [2021-03-11 18:20:43] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2021-03-11 18:20:43] subtracting mean(normal) per gene per cell across all data
INFO [2021-03-11 18:20:45] -subtracting expr per gene, use_bounds=TRUE
INFO [2021-03-11 18:21:05]

STEP 14: invert log2(FC) to FC

INFO [2021-03-11 18:21:05] invert_log2(), computing 2^x
INFO [2021-03-11 18:21:09] -mirroring for hspike
INFO [2021-03-11 18:21:09] invert_log2(), computing 2^x
INFO [2021-03-11 18:21:36]

STEP 15: Clustering samples (not defining tumor subclusters)

INFO [2021-03-11 18:21:36] define_signif_tumor_subclusters(p_val=0.1
INFO [2021-03-11 18:21:36] define_signif_tumor_subclusters(), tumor: epi
INFO [2021-03-11 19:23:38] cut tree into: 1 groups
INFO [2021-03-11 19:23:38] -processing epi,epi_s1
INFO [2021-03-11 19:23:38] -mirroring for hspike
INFO [2021-03-11 19:23:38] define_signif_tumor_subclusters(p_val=0.1
INFO [2021-03-11 19:23:38] define_signif_tumor_subclusters(), tumor: spike_tumor_cell_normalsToUse
INFO [2021-03-11 19:23:38] cut tree into: 1 groups
INFO [2021-03-11 19:23:38] -processing spike_tumor_cell_normalsToUse,spike_tumor_cell_normalsToUse_s1
INFO [2021-03-11 19:23:38] define_signif_tumor_subclusters(), tumor: simnorm_cell_normalsToUse
INFO [2021-03-11 19:23:38] cut tree into: 1 groups
INFO [2021-03-11 19:23:38] -processing simnorm_cell_normalsToUse,simnorm_cell_normalsToUse_s1
INFO [2021-03-11 19:24:32] ::plot_cnv:Start
INFO [2021-03-11 19:24:32] ::plot_cnv:Current data dimensions (r,c)=6314,5312 Total=33581835.5515774 Min=0.678063585770801 Max=1.63133365772827.
INFO [2021-03-11 19:24:32] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 19:26:11] plot_cnv(): auto thresholding at: (0.867432 , 1.135064)
INFO [2021-03-11 19:26:12] plot_cnv_observation:Start
INFO [2021-03-11 19:26:12] Observation data size: Cells= 5312 Genes= 6314
INFO [2021-03-11 19:26:12] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 19:26:13] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 19:26:13] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 19:26:13] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 19:26:19] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 19:26:19] Quantiles of plotted data range: 0.867432335329716,0.968597910814697,0.99904837579043,1.03138128709615,1.13506424138601
INFO [2021-03-11 19:26:25] plot_cnv_observations:Writing observation data to plot_out2//infercnv.preliminary.observations.txt
INFO [2021-03-11 19:28:05]

STEP 17: HMM-based CNV prediction

INFO [2021-03-11 19:28:05] predict_CNV_via_HMM_on_whole_tumor_samples
INFO [2021-03-11 19:28:07] -done predicting CNV based on initial tumor subclusters
INFO [2021-03-11 19:28:16] get_predicted_CNV_regions(subcluster)
INFO [2021-03-11 19:28:16] -processing cell_group_name: epi.epi_s1, size: 5312
INFO [2021-03-11 19:28:59] -writing cell clusters file: plot_out2//17_HMM_predHMMi6.hmm_mode-samples.cell_groupings
INFO [2021-03-11 19:28:59] -writing cnv regions file: plot_out2//17_HMM_predHMMi6.hmm_mode-samples.pred_cnv_regions.dat
INFO [2021-03-11 19:28:59] -writing per-gene cnv report: plot_out2//17_HMM_predHMMi6.hmm_mode-samples.pred_cnv_genes.dat
INFO [2021-03-11 19:28:59] -writing gene ordering info: plot_out2//17_HMM_predHMMi6.hmm_mode-samples.genes_used.dat
INFO [2021-03-11 19:29:00] ::plot_cnv:Start
INFO [2021-03-11 19:29:00] ::plot_cnv:Current data dimensions (r,c)=6314,5312 Total=100619904 Min=3 Max=3.
INFO [2021-03-11 19:29:00] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 19:30:04] plot_cnv_observation:Start
INFO [2021-03-11 19:30:04] Observation data size: Cells= 5312 Genes= 6314
INFO [2021-03-11 19:30:04] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 19:30:04] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 19:30:04] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 19:30:04] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 19:30:11] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 19:30:11] Quantiles of plotted data range: 3,3,3,3,3
INFO [2021-03-11 19:30:17] plot_cnv_observations:Writing observation data to plot_out2//infercnv.17_HMM_predHMMi6.hmm_mode-samples.observations.txt
INFO [2021-03-11 19:31:21]

STEP 19: Converting HMM-based CNV states to repr expr vals

INFO [2021-03-11 19:31:32] ::plot_cnv:Start
INFO [2021-03-11 19:31:32] ::plot_cnv:Current data dimensions (r,c)=6314,5312 Total=33539968 Min=1 Max=1.
INFO [2021-03-11 19:31:32] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 19:32:34] plot_cnv_observation:Start
INFO [2021-03-11 19:32:34] Observation data size: Cells= 5312 Genes= 6314
INFO [2021-03-11 19:32:35] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 19:32:35] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 19:32:35] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 19:32:35] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 19:32:42] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 19:32:42] Quantiles of plotted data range: 1,1,1,1,1
INFO [2021-03-11 19:32:48] plot_cnv_observations:Writing observation data to plot_out2//infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.observations.txt
INFO [2021-03-11 19:33:50]

STEP 21: Denoising

INFO [2021-03-11 19:33:50] ::process_data:Remove noise, noise threshold defined via ref mean sd_amplifier: 1.5
INFO [2021-03-11 19:33:50] -no reference cells specified... using mean and sd of all cells as proxy for denoising
INFO [2021-03-11 19:33:52] :: **** clear_noise_via_ref_quantiles **** : removing noise between bounds: 0.92762824951991 - 1.07486832719581
INFO [2021-03-11 19:34:03] ::plot_cnv:Start
INFO [2021-03-11 19:34:03] ::plot_cnv:Current data dimensions (r,c)=6314,5312 Total=33644048.472462 Min=0.678063585770801 Max=1.63133365772827.
INFO [2021-03-11 19:34:03] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 19:35:53] plot_cnv(): auto thresholding at: (0.871142 , 1.135064)
INFO [2021-03-11 19:35:54] plot_cnv_observation:Start
INFO [2021-03-11 19:35:54] Observation data size: Cells= 5312 Genes= 6314
INFO [2021-03-11 19:35:55] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 19:35:55] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 19:35:56] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 19:35:56] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 19:36:04] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 19:36:04] Quantiles of plotted data range: 0.871142113519701,1.00124828835786,1.00124828835786,1.00124828835786,1.13506424138601
INFO [2021-03-11 19:36:12] plot_cnv_observations:Writing observation data to plot_out2//infercnv.21_denoised.observations.txt
INFO [2021-03-11 19:38:08]

Making the final infercnv heatmap

INFO [2021-03-11 19:38:09] ::plot_cnv:Start
INFO [2021-03-11 19:38:09] ::plot_cnv:Current data dimensions (r,c)=6314,5312 Total=33644048.472462 Min=0.678063585770801 Max=1.63133365772827.
INFO [2021-03-11 19:38:09] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 19:39:54] plot_cnv(): auto thresholding at: (0.864936 , 1.135064)
INFO [2021-03-11 19:39:55] plot_cnv_observation:Start
INFO [2021-03-11 19:39:55] Observation data size: Cells= 5312 Genes= 6314
INFO [2021-03-11 19:39:56] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 19:39:56] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 19:39:56] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 19:39:56] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 19:40:03] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 19:40:03] Quantiles of plotted data range: 0.864935758613995,1.00124828835786,1.00124828835786,1.00124828835786,1.13506424138601
INFO [2021-03-11 19:40:08] plot_cnv_observations:Writing observation data to plot_out2//infercnv.observations.txt
Warning messages:
1: In dir.create(out_dir) : 'plot_out2' already exists
2: In dir.create(out_dir) : 'plot_out2' already exists
3: In dir.create(out_dir) : 'plot_out2' already exists
4: In dir.create(out_dir) : 'plot_out2' already exists
5: In dir.create(out_dir) : 'plot_out2' already exists
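
The five warnings are R's standard message when dir.create() is called on a directory that already exists, and the double slash in paths such as plot_out2//infercnv.observations.txt comes from the trailing slash in out_dir='plot_out2/'. A small base-R sketch of tidier path handling for your own follow-up code might look like this; the scratch directory is hypothetical and nothing here changes what infercnv does internally.

```r
# out_dir as passed to infercnv::run() in the session above.
out_dir_raw <- "plot_out2/"

# Drop trailing slashes so that joined paths contain a single separator.
out_dir <- sub("/+$", "", out_dir_raw)

# dir.create() warns when the directory already exists; checking first
# avoids that warning in your own code.
demo_root <- tempfile("infercnv_demo_")   # hypothetical scratch location
demo_dir  <- file.path(demo_root, out_dir)
if (!dir.exists(demo_dir)) dir.create(demo_dir, recursive = TRUE)

# Path of the final matrix named in the last log line, without the "//".
final_matrix <- file.path(out_dir, "infercnv.observations.txt")
print(final_matrix)
```
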

> end_time <- Sys.time()
> end_time
[1] "2021-03-11 19:41:46 CST"
> end_time - start_time
Time difference of 1.451607 hours
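
To see where the 1.45 hours went, the timestamps in the log can be differenced directly. The sketch below copies four timestamps from the INFO lines above; the phase labels are my own grouping, not infercnv terminology, and the total is slightly shorter than the measured time because it stops at the last log line.

```r
# Timestamps copied from the INFO lines of the log above.
marks <- data.frame(
  event = c("run start", "STEP 15 start",
            "STEP 15 tree cut", "final matrix written"),
  time  = as.POSIXct(c("2021-03-11 18:14:40", "2021-03-11 18:21:36",
                       "2021-03-11 19:23:38", "2021-03-11 19:40:08"),
                     tz = "UTC"),
  stringsAsFactors = FALSE
)

# Minutes between consecutive marks, and each phase's share of the total.
phase <- data.frame(
  phase   = c("preprocessing (STEP 1 to 14)",
              "STEP 15 clustering",
              "plots, HMM, denoising"),
  minutes = as.numeric(difftime(marks$time[-1],
                                marks$time[-nrow(marks)],
                                units = "mins"))
)
phase$share <- phase$minutes / sum(phase$minutes)
print(phase)
```

The clustering in STEP 15 accounts for roughly an hour of the run, so that step is the first place to look if the runtime needs to come down.
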

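The gene counts and the HMM totals in the log can be cross-checked with a few lines of arithmetic. All figures below are copied from the log; the table of i6 states and their representative values is an assumption taken from my reading of the infercnv documentation, not something this log states.

```r
# Figures copied from the log above.
genes_in_matrix    <- 17140   # rows reported after order_reduce
genes_not_in_order <- 536     # removed: absent from the gene order file
genes_below_cutoff <- 10290   # removed in STEP 02 (mean expr below 0.1)
cells              <- 5312

pct_removed     <- 100 * genes_not_in_order / genes_in_matrix
genes_remaining <- genes_in_matrix - genes_not_in_order - genes_below_cutoff

# Assumed i6 HMM states and the representative value each one maps to.
i6_states <- data.frame(
  state = 1:6,
  repr  = c(0, 0.5, 1, 1.5, 2, 3),
  label = c("complete loss", "loss of one copy", "neutral",
            "gain of one copy", "gain of two copies",
            "gain of more than two copies"),
  stringsAsFactors = FALSE
)

# STEP 17 reports Min=3 Max=3 and STEP 19 reports Min=1 Max=1, so both
# totals should equal genes x cells x a single constant.
step17_total <- genes_remaining * cells * 3
step19_total <- genes_remaining * cells *
  i6_states$repr[i6_states$state == 3]

print(c(genes_remaining = genes_remaining,
        step17_total = step17_total,
        step19_total = step19_total))
```

Under that assumed mapping, a matrix made up entirely of state 3 means the HMM called every gene in every cell as neutral in this run, where all cells served as their own reference.
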