Useful links

Under construction…

Genome Browsers

Statistical Analysis of high-throughput data

Machine Learning

Next Generation Sequencing Tools

  • Novoalign: Billed as the most accurate aligner to date for single-end and paired-end reads from the Illumina Genome Analyzer and for 454 paired-end reads.
  • BWA: The Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome.
  • The Genome Analysis Toolkit (GATK): A software package developed at the Broad Institute to analyse next-generation resequencing data.
  • samtools: SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
  • vcftools: A program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. VCFtools aims to provide methods for validating, merging and comparing VCF files, and for calculating some basic population genetic statistics.
  • BEDTools: A flexible suite of utilities for comparing genomic features. The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by “streaming” several BEDTools together.
  • FASTX-Toolkit: A collection of command-line tools for preprocessing short-read FASTA/FASTQ files.
  • FastQC: A quality control tool for high-throughput sequence data.
  • Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
  • DELLY is an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired ends and split reads to sensitively and accurately delineate genomic rearrangements throughout the genome.
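
The BEDTools entry above mentions finding feature overlaps. As a rough illustration of what that task involves (this is not how BEDTools itself is implemented), here is a minimal Python sketch. It assumes BED-style intervals, meaning a 0-based start and an exclusive end; the function name `find_overlaps` and the toy coordinates are made up for illustration.

```python
def find_overlaps(features_a, features_b):
    """Return (feature_a, feature_b, overlap_length) for every overlapping pair.

    Each feature is a (chrom, start, end) tuple with a 0-based start and an
    exclusive end, as in the BED format. This is a simple all-pairs scan,
    fine for toy data but far too slow for whole-genome files.
    """
    hits = []
    for chrom_a, start_a, end_a in features_a:
        for chrom_b, start_b, end_b in features_b:
            # Half-open intervals overlap when each one starts before the other ends.
            if chrom_a == chrom_b and start_a < end_b and start_b < end_a:
                overlap = min(end_a, end_b) - max(start_a, start_b)
                hits.append(((chrom_a, start_a, end_a),
                             (chrom_b, start_b, end_b),
                             overlap))
    return hits


# Hypothetical toy features, not taken from any real annotation.
genes = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
peaks = [("chr1", 150, 250), ("chr1", 200, 300), ("chr2", 50, 101)]

hits = find_overlaps(genes, peaks)
for gene, peak, length in hits:
    print(gene, peak, length)
```

Note that the gene ending at position 200 and the peak starting at position 200 do not overlap under the half-open convention, which is a common source of off-by-one errors when mixing file formats.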

Variant Annotation for NGS data

Genetic Analysis Software

  • PLINK is a free, open-source whole-genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
  • SNPTEST: A program for frequentist and Bayesian tests of SNP association with binary (case-control) and quantitative phenotypes that takes genotype uncertainty into account.
  • QUICKTEST: Implements the statistical methods for uncertain (imputed) genotype association testing that were published in the article “Methods for testing association between uncertain genotypes and quantitative traits”. USE THIS FOR QUANTITATIVE TRAITS!
  • GenABEL: A set of R packages for the analysis of genetic data. Includes tools for data management, file conversions (e.g. IMPUTE to MaCH format), efficient storage, analysis of genotyped and imputed (e.g. dosage) data, meta-analysis, prediction and more.
  • IMPUTE2: A program for genotype imputation and phasing in genome-wide association studies and fine-mapping.
  • SHAPEIT: A program for accurate and efficient phasing of genetic datasets.
  • BEAGLE: A state-of-the-art software package for the analysis of large-scale genetic data sets with hundreds of thousands of markers genotyped on thousands of samples. BEAGLE can:
    1. phase genotype data (i.e. infer haplotypes) for unrelated individuals, parent-offspring pairs, and parent-offspring trios.
    2. infer sporadic missing genotype data.
    3. impute ungenotyped markers that have been genotyped in a reference panel.
    4. perform single marker and haplotypic association analysis.
    5. detect genetic regions that are homozygous-by-descent in an individual or identical-by-descent in pairs of individuals.
  • BEAGLE Utilities: Simple utility programs for manipulating text files. If you are performing analyses using BEAGLE, you may find some of these programs useful for preparing input files and for working with output files. The BEAGLE utilities are written in Java and run on all common computing platforms (e.g. Windows, Unix, Linux, Solaris, Mac).
  • EIGENSTRAT detects and corrects for population stratification in genome-wide association studies. The method, based on principal components analysis, explicitly models ancestry differences between cases and controls along continuous axes of variation. The resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The approach is powerful as well as fast, and can easily be applied to disease studies with hundreds of thousands of markers.
  • GWAPOWER: An R package for assessing the power of genome-wide association studies using commercially available genotyping chips. The package encapsulates extensive simulation results generated by the authors’ program HAPGEN and described fully in their paper.
  • META: A program to carry out meta-analysis of genetic studies.
  • GWAMA: Genome-Wide Association Meta Analysis software to perform meta-analysis of the results of GWA studies of binary or quantitative phenotypes.
  • METAL: Designed to facilitate meta-analysis of large datasets (such as several whole-genome scans) in a convenient, rapid and memory-efficient manner.
  • INRICH: Interval-based Enrichment Analysis Tool for Genome Wide Association Studies
  • FORGE: A tool to perform gene-based genome-wide association studies. It allows information from different genetic variants to be combined into a single statistic. Its authors report that it provides additional power to detect true disease loci and that it is useful for pathway or network analyses (Pedroso et al., submitted).
  • Mike Weale’s GWAS tools (Mike Weale is a statistical geneticist at King’s College London)
    1. Bayes Factors
    2. GWAS code
    3. EIGENSOFTplus
    4. Manhattan plots
    5. QQ plots

  • CIT algorithm (R script, CITtest.r): “Disentangling molecular relationships with a causal inference test” (Joshua Millstein, Bin Zhang, Jun Zhu, Eric E. Schadt)
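
The principal-components correction for population stratification described in the list above can be hard to picture from prose alone. The following Python sketch illustrates the general idea on toy data: infer a leading axis of variation from genotypes, remove it from both a candidate marker and the phenotype, and compare the association before and after. It is not the EIGENSTRAT program or its exact test statistic; the function names, the power-iteration shortcut, and all numbers are assumptions made for illustration.

```python
import random


def center(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def top_axis(genotypes, iterations=200, seed=0):
    """Leading principal axis across samples, found by power iteration.

    `genotypes` holds one row per sample and one allele count per marker.
    """
    n = len(genotypes)
    # Center every marker (column), then go back to one row per sample.
    columns = [center(list(col)) for col in zip(*genotypes)]
    rows = list(zip(*columns))
    # Unnormalised sample-by-sample covariance: cov[i][k] = row_i . row_k
    cov = [[sum(a * b for a, b in zip(rows[i], rows[k])) for k in range(n)]
           for i in range(n)]
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]
    for _ in range(iterations):
        w = [sum(cov[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def residualize(values, axis):
    """Remove the part of `values` that is linear in `axis`."""
    cv, ca = center(values), center(axis)
    slope = sum(x * a for x, a in zip(cv, ca)) / sum(a * a for a in ca)
    return [x - slope * a for x, a in zip(cv, ca)]


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    cx, cy = center(x), center(y)
    num = sum(a * b for a, b in zip(cx, cy))
    den = (sum(a * a for a in cx) * sum(b * b for b in cy)) ** 0.5
    return num / den


# Toy data (hypothetical): samples 1-4 come from one population, 5-8 from another.
# Three markers differ completely between populations; the fourth does not.
genotypes = [
    [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 2], [0, 0, 0, 0],
    [2, 2, 2, 0], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0],
]
# The candidate marker and the phenotype both shift between populations,
# but are unrelated to each other within a population.
candidate = [0, 1, 0, 1, 1, 2, 1, 2]
phenotype = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 6.0, 6.0]

raw = correlation(candidate, phenotype)
axis = top_axis(genotypes)
adjusted = correlation(residualize(candidate, axis),
                       residualize(phenotype, axis))

# The raw correlation is sizeable (about 0.69); after removing the ancestry
# axis it is essentially zero, so the apparent association was spurious.
print(raw, adjusted)
```

In this toy example the association disappears entirely because it was constructed to be driven only by population membership. Real studies typically use several axes and a dedicated test statistic, which this sketch does not attempt to reproduce.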
