RNAi experiments in mammalian systems require either in vitro diced long dsRNAs (esiRNAs) or synthetic siRNAs. NEXT-RNAi was used to design a genome-wide esiRNA library targeting all human genes annotated in the NCBI RefSeq database (release 40). To this end, regions common to all RefSeq transcripts belonging to the same gene were computed for the complete human genome (37,627 regions). Low-complexity sequences were then filtered using mdust, and remaining sequences longer than 560 nt were split into two sequences to obtain a higher number of potential target sites for NEXT-RNAi reagent designs. Overall, this resulted in 83,416 input sequences for NEXT-RNAi.
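The splitting of sequences longer than 560 nt can be illustrated with a minimal Python sketch. The text does not state where each sequence was cut, so the midpoint split below is an assumption, and the function name `split_long_sequence` is hypothetical rather than part of NEXT-RNAi.

```python
def split_long_sequence(seq, max_len=560):
    """Return the sequence unchanged if it is at most max_len nt long,
    otherwise return two pieces.

    Assumption: the cut is made at the midpoint; the actual split
    position used for the library is not documented here.
    """
    if len(seq) <= max_len:
        return [seq]
    mid = len(seq) // 2
    return [seq[:mid], seq[mid:]]


# Usage: a 600 nt toy sequence is split into two 300 nt pieces.
pieces = split_long_sequence("ACGT" * 150)
print([len(p) for p in pieces])  # [300, 300]
```

Splitting at the midpoint keeps both pieces as long as possible, which matters because each piece has to be long enough to hold a full esiRNA design.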
NEXT-RNAi HTML outputs are available here
Overall, 82,516 designs were obtained, covering 97.8% of the genome. 73.8% of all genes are covered by at least one design that does not show homology of 19 nt or longer to any other gene. In addition, 88.4% of all genes are covered by at least one second, independent design.
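Coverage figures of this kind are percentages of genes that have at least one design meeting a given criterion. The following sketch shows how such percentages could be tallied. The input layout (a list of `(gene_id, is_specific)` pairs) and the name `gene_coverage` are assumptions for illustration and do not reflect the NEXT-RNAi output format.

```python
def gene_coverage(all_genes, designs):
    """Compute the percentage of genes with any design and with a specific design.

    all_genes: set of gene identifiers (hypothetical input).
    designs:   iterable of (gene_id, is_specific) pairs, where is_specific
               marks a design without 19 nt homology to any other gene.
    """
    covered = {gene for gene, _ in designs} & all_genes
    specific = {gene for gene, is_spec in designs if is_spec} & all_genes
    total = len(all_genes)
    return 100 * len(covered) / total, 100 * len(specific) / total


# Toy data: four genes, three of them covered, two by a specific design.
genes = {"g1", "g2", "g3", "g4"}
toy_designs = [("g1", True), ("g1", False), ("g2", False), ("g3", True)]
print(gene_coverage(genes, toy_designs))  # (75.0, 50.0)
```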
Input files and settings used
Input FASTA file
human.rnaMOD.COMMON_mdust_split_crsplit.zip (20MB) containing target sequences as input file (-i input).
Targetgroup file (tab-delimited)
TargetGroups_GeneID.tab (1.3MB) defining which RefSeq transcripts belong to the same gene (headers Target and TargetGroup) (TARGETGROUPS option)
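As an illustration of the target group file layout, the sketch below reads a tab-delimited table with the headers Target and TargetGroup and collects the transcripts belonging to each gene. The accessions and gene IDs in the example are made up, and `read_target_groups` is a hypothetical helper, not a NEXT-RNAi function.

```python
import csv
import io
from collections import defaultdict


def read_target_groups(handle):
    """Map each TargetGroup (gene) to the list of its Target entries (transcripts)."""
    reader = csv.DictReader(handle, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row["TargetGroup"]].append(row["Target"])
    return dict(groups)


# Made-up example content with the two expected column headers.
example = io.StringIO(
    "Target\tTargetGroup\n"
    "NM_000001.1\t101\n"
    "NM_000002.1\t101\n"
    "NM_000003.1\t202\n"
)
groups = read_target_groups(example)
print(groups["101"])  # ['NM_000001.1', 'NM_000002.1']
```

For the real file, the same function could be called on an opened file handle instead of the in-memory example.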
Bowtie database/index for off-target evaluation
Bowtie database/index containing annotated RefSeq transcripts (release 40) for specificity calculations (-d input):
Feature file with UTR and SNP locations
Tab-delimited feature file containing mappings of UTRs and SNPs (from NCBI dbSNP) to chromosomes that is used to calculate UTR and SNP 'contents' (FEATURE option) of designed reagents: Hs_UTR_SNP.tar.gz (484MB)
FASTA file for homology evaluation
Transcriptome FASTA file to evaluate the homology of the designs using Blast (HOMOLOGY option):
Start of program
Descriptions for start parameters used are available here.
Descriptions for all options used are available here.