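miRNA-seq data analysis has been around for many years, yet its pipeline is still not standardized

Quality control of the sequencing data: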
Sequencing was performed on a HiSeq2000 instrument running TruSeq version 3 chemistry for 50 cycles. Base calling and quality score calculation were performed from raw intensities using Illumina's pipeline version 1.8.1. The called reads were trimmed with the command line: fastx_trimmer -f 1 -l 36 and low-quality reads discarded with fastx_artifacts_filter using the options -q 10. Adapters were clipped using the AdRec.jar program from the seqBuster suite with the following options: java -jar AdRec.jar 1 8 0.3.
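To make the trimming step concrete, here is a minimal Python sketch of what `fastx_trimmer -f 1 -l 36` does to a single read: it keeps bases 1 through 36 (1-based, inclusive). The quality check that follows is only an illustration. It assumes Phred+33 encoding and a "discard the read if any base falls below Q10" rule, which is not necessarily the rule the quoted pipeline applied. The function names are made up for this example.

```python
def trim_read(seq, qual, first=1, last=36):
    """Keep bases first..last (1-based, inclusive), as fastx_trimmer -f/-l does."""
    return seq[first - 1:last], qual[first - 1:last]


def min_phred(qual, offset=33):
    """Lowest Phred score in a quality string (assumes Phred+33 encoding)."""
    return min(ord(ch) - offset for ch in qual)


def passes_quality(qual, threshold=10, offset=33):
    """Assumed rule for illustration: keep the read only if every base has
    a Phred score >= threshold. Empty reads are rejected."""
    return bool(qual) and min_phred(qual, offset) >= threshold


# A 50-cycle read trimmed to its first 36 bases.
raw_seq = "ACGT" * 12 + "AC"   # 50 bases
raw_qual = "I" * 50            # 'I' is Q40 under Phred+33
seq36, qual36 = trim_read(raw_seq, raw_qual)
print(len(seq36), passes_quality(qual36))
```

Trimming before adapter clipping means an adapter that starts beyond base 36 is simply cut away, while one that starts earlier is left for the clipping step.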
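To discover novel miRNAs, the reads need to be aligned to the reference genome.
Use bowtie -f -v 0 -a -m 5 --strata --best to align the miRNA FASTA file to the human reference genome, then remove the sequences that fall within annotations of tRNA or rRNA (RepeatMasker hg19) or within known miRNA hairpins (miRBase version 19), i.e. the already-known miRNAs.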
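The annotation-filtering step can be sketched in Python as a simple interval-overlap test. This is only an illustration under stated assumptions: coordinates are 0-based and half-open (as in BED), the annotation records and candidate hits below are invented, and a real pipeline would more likely use a tool such as bedtools on the RepeatMasker and miRBase tracks.

```python
from collections import defaultdict


def build_index(annotations):
    """Group (chrom, start, end, label) records by chromosome."""
    index = defaultdict(list)
    for chrom, start, end, label in annotations:
        index[chrom].append((start, end, label))
    for chrom in index:
        index[chrom].sort()
    return index


def overlapping_labels(index, chrom, start, end):
    """Labels of all annotations overlapping [start, end) on chrom."""
    return [label for s, e, label in index.get(chrom, [])
            if s < end and start < e]


def filter_candidates(hits, index):
    """Keep only hits (name, chrom, start, end) that overlap no annotation."""
    return [h for h in hits
            if not overlapping_labels(index, h[1], h[2], h[3])]


# Hypothetical annotation records and alignment hits, for illustration only.
annotations = [
    ("chr1", 100, 180, "tRNA"),
    ("chr2", 500, 590, "known_hairpin"),
]
hits = [
    ("cand1", "chr1", 150, 172),  # inside the tRNA annotation, removed
    ("cand2", "chr1", 300, 322),  # no overlap, kept
    ("cand3", "chr2", 590, 612),  # starts where the hairpin ends, kept
]
novel = filter_candidates(hits, build_index(annotations))
print([h[0] for h in novel])
```

The linear scan per chromosome is fine for a sketch. For genome-scale annotation tracks an interval tree or a sorted sweep would be the usual choice.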

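TopHat was first published 6 years ago, and Cufflinks 5 years ago. STAR aligns 50 times faster than TopHat, and HISAT is another 1.2 times faster than STAR. StringTie assembles transcripts 25 times faster than Cufflinks while using less than half the memory. Ballgown offers higher specificity and accuracy than Cuffdiff in differential analysis, and takes less than one thousandth of Cuffdiff's run time. Bowtie2+eXpress outperforms TopHat2+Cufflinks and Bowtie2+RSEM for quantification. Sailfish skips the alignment step altogether and quantifies directly from k-mer counts; its specificity and accuracy are acceptable, and it is 25 times faster. kallisto likewise needs no alignment and is another 5 times faster than Sailfish!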



Our adapter-trimming algorithm identified as long an adapter sequence as possible, allowing a number of mismatches that depended on the adapter length found. Because the shortest mature miRNA in miRBase v16 is 15 bp, we discarded any trimmed read that was shorter than 15 bp. We used BWA-ALN with parameters samse -n 10 to align the remaining reads to a reference genome, which, for most TCGA cancers, was GRCh37.
https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/
https://github.com/bcgsc/mirna
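The adapter-trimming idea quoted above (find the longest adapter portion, tolerate a number of mismatches that depends on its length, discard trimmed reads shorter than 15 bp) can be sketched in Python as follows. This is not the bcgsc implementation. The adapter sequence is a placeholder, and the mismatch schedule in `allowed_mismatches` and the `min_overlap` value are assumptions chosen purely for illustration.

```python
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"  # placeholder 3' adapter, an assumption


def allowed_mismatches(adapter_len):
    """Hypothetical schedule: longer adapter matches tolerate more mismatches."""
    if adapter_len < 8:
        return 0
    if adapter_len < 16:
        return 1
    return 2


def find_adapter_start(read, adapter=ADAPTER, min_overlap=3):
    """Return the leftmost position where the adapter (or the prefix of it
    that fits before the read ends) matches within the allowed mismatches.
    Leftmost means the longest adapter portion. Returns None if not found."""
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            1 for a, b in zip(read[start:start + overlap], adapter[:overlap])
            if a != b
        )
        if mismatches <= allowed_mismatches(overlap):
            return start
    return None


def trim_and_filter(read, adapter=ADAPTER, min_len=15):
    """Clip the adapter, then drop reads shorter than min_len (returns None)."""
    start = find_adapter_start(read, adapter)
    trimmed = read if start is None else read[:start]
    return trimmed if len(trimmed) >= min_len else None


mirna = "TGAGGTAGTAGGTTGTATAGTT"            # a 22-nt insert
read = mirna + ADAPTER[:14]                 # 36-nt read ending in partial adapter
print(trim_and_filter(read))                # the insert without the adapter
print(trim_and_filter(ADAPTER))             # adapter dimer, discarded
```

Scanning from the 5' end and stopping at the first acceptable position is what makes the match "as long as possible": any later start would cover a shorter stretch of adapter.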
