RNAi experiments in mammalian systems require either in vitro diced long dsRNAs (esiRNAs) or synthetic siRNAs. NEXT-RNAi was used to design a genome-wide siRNA library targeting all human genes annotated in the NCBI RefSeq database (release 40). To this end, regions common to all RefSeq transcripts belonging to the same gene were computed for the complete human genome (37,627 regions). Low-complexity sequences were then filtered using mdust, and remaining sequences longer than 100 nt were split into two sequences to obtain a higher number of potential target sites for NEXT-RNAi reagent designs. Overall, this resulted in 100,270 sequences used as input for NEXT-RNAi.
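The splitting step described above can be illustrated with a minimal Python sketch. It is not the script used to build the library: the exact split point is not stated here, so the sketch assumes a split at the midpoint, and the function name `split_long_sequences` as well as the `_1`/`_2` identifier suffixes are hypothetical.

```python
def split_long_sequences(records, max_len=100):
    """Split every sequence longer than max_len into two parts.

    records: iterable of (identifier, sequence) pairs.
    Returns a list of (identifier, sequence) pairs. The "_1" and "_2"
    suffixes are hypothetical naming choices for this sketch.
    """
    result = []
    for seq_id, seq in records:
        if len(seq) > max_len:
            mid = len(seq) // 2  # assumption: split at the midpoint
            result.append((f"{seq_id}_1", seq[:mid]))
            result.append((f"{seq_id}_2", seq[mid:]))
        else:
            # Sequences of max_len or shorter are passed through unchanged.
            result.append((seq_id, seq))
    return result


# Invented example records: one 120 nt region and one 40 nt region.
example = [("regionA", "ACGT" * 30), ("regionB", "ACGT" * 10)]
split_records = split_long_sequences(example)
```

With these invented records, the 120 nt region yields two 60 nt sequences and the 40 nt region is kept as is, so each long common region contributes two candidate inputs instead of one.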
NEXT-RNAi HTML outputs are available here.
Overall, 100,264 designs were obtained, covering 99.9% of the genome. 83.4% of all genes are covered by at least one design that shows no 19-nt homology to any other gene. 97% of all genes are additionally covered by at least one second, independent design.
Input files and settings used
Input FASTA file
human.rnaMOD.COMMON_mdust_split_crsplit.zip (20MB) containing target sequences as input file (-i input).
Targetgroup file (tab-delimited)
TargetGroups_GeneID.tab (1.3MB) defining which RefSeq transcripts belong to the same gene (headers Target and TargetGroup) (TARGETGROUPS option)
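As an illustration of the target group file layout, the following Python sketch reads a tab-delimited table with the headers Target and TargetGroup and collects the transcripts of each gene. Only the two header names come from the description above; the function name `read_target_groups` and the identifiers in the example rows are invented placeholders, not entries from TargetGroups_GeneID.tab.

```python
import csv
import io
from collections import defaultdict


def read_target_groups(handle):
    """Map each TargetGroup (gene) to the list of its Targets (transcripts)."""
    groups = defaultdict(list)
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        groups[row["TargetGroup"]].append(row["Target"])
    return dict(groups)


# Invented placeholder content standing in for the real file.
example_file = io.StringIO(
    "Target\tTargetGroup\n"
    "transcript_A\tgene_1\n"
    "transcript_B\tgene_1\n"
    "transcript_C\tgene_2\n"
)
target_groups = read_target_groups(example_file)
```

To read a file from disk instead of the in-memory example, pass an opened file handle, for instance `open(path, newline="")`, to the same function.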
Bowtie database/index for off-target evaluation
Bowtie database/index containing annotated RefSeq transcripts (release 40) for specificity calculations (-d input):
Feature file with UTR and SNP locations
Tab-delimited feature file containing mappings of UTRs and SNPs (from NCBI dbSNP) to chromosomes that is used to calculate UTR and SNP 'contents' (FEATURE option) of designed reagents: Hs_UTR_SNP.tar.gz (484MB)
FASTA file for homology evaluation
Transcriptome FASTA file to evaluate the homology of the designs using Blast (HOMOLOGY and TXNFASTA options):
Bowtie database/index for seed match evaluation
To compute the number of siRNA seed matches (seed complement frequency), a Bowtie database/index containing all annotated 3'-UTR sequences (RefSeq release 40) was generated for use with the SEEDMATCH option (Hs_3UTR.tar.gz (26MB)).
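The idea behind the seed complement frequency can be sketched in a few lines of Python. This is not how NEXT-RNAi computes it: the sketch replaces the Bowtie search with a plain string search, assumes the seed is nucleotides 2 to 8 of the guide strand, and assumes sequences are written in the DNA alphabet. The function names and all example sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def seed_match_count(guide, utr_sequences, seed_start=1, seed_end=8):
    """Count occurrences of the seed complement in a set of 3'-UTRs.

    Assumption: the seed is guide positions 2-8 (1-based), which is
    the 0-based slice [1:8]. Overlapping matches are counted.
    """
    seed = guide[seed_start:seed_end]
    target_site = reverse_complement(seed)
    count = 0
    for utr in utr_sequences:
        start = utr.find(target_site)
        while start != -1:
            count += 1
            start = utr.find(target_site, start + 1)
    return count


# Invented example: a 19 nt guide and three short mock 3'-UTRs.
guide = "TAGCTTATCAGACTGATGT"
utrs = ["GGGATAAGCTGGG", "CCCC", "ATAAGCTATAAGCT"]
matches = seed_match_count(guide, utrs)
```

A short-read aligner such as Bowtie serves the same purpose at transcriptome scale, where a naive scan over all 3'-UTRs for every design would be slow.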
Start of program
Descriptions of the start parameters used are available here.
Descriptions of all options used are available here.