NEXT-RNAi Version 1.31 (22/04/2010) - Designing and evaluating genome-wide libraries for RNAi screens
NEXT-RNAi is software for the design and evaluation of genome-wide RNAi libraries. It performs all steps from the prediction of specific and efficient RNAi target sites to the visualization of designed reagents in their genomic context. The software supports the design and evaluation of siRNAs and long dsRNAs and was implemented in an organism-independent manner, allowing designs for all sequenced and annotated genomes.
Please visit http://www.nextrnai.org/ for complete documentation of NEXT-RNAi.
Input file containing target sequences (in FASTA format)
<int> number of features (FASTA sequences) from input file that are processed at once (optional, default=4000)
Reagent type (d = long dsRNA, s = short interfering RNA) designed or evaluated
-d <Bowtie database/index>
Location of Bowtie database/index file (prebuilt with bowtie-build); multiple inputs are allowed (separated by ’+’) (optional; if set to ’nodb’, NEXT-RNAi will run without ’off-target’ evaluation)
NO: de novo design of RNAi reagents
OLIGO: evaluation of primers for long dsRNAs (-r d) or siRNAs (-r s)
DSRNA: evaluation of long dsRNAs (-r d)
DSRNA+OLIGO: evaluation of long dsRNAs and underlying primers (-r d)
File containing further settings for RNAi reagent design/evaluation in a TAG=VALUE format (optional)
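As an illustration of the TAG=VALUE format, the following Python sketch reads such settings into a dictionary. The function name parse_options is hypothetical, and the skipping of blank lines and ’#’ comment lines is an assumption, not documented NEXT-RNAi behavior. Values are collected in lists because some tags (e.g. TARGETGROUPS) may be given more than once.

```python
def parse_options(text):
    """Parse TAG=VALUE lines into a dict mapping each tag to a list of values."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        tag, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a TAG=VALUE line: {line!r}")
        # Tolerate spaces around '=' and keep repeated tags in order.
        options.setdefault(tag.strip(), []).append(value.strip())
    return options


example = """\
SIRNALENGTH=19
EFFICIENCY=SIR,50
REDESIGN=ON
"""
opts = parse_options(example)
```

Comma-separated values such as ’SIR,50’ are kept as one string here; splitting them is left to the code that interprets each tag.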
-n <probe name>
Name tag for files generated by NEXT-RNAi (optional, default=Probe)
Show help (optional)
-p <interactive mode>
Start interactive setting of NEXT-RNAi options (optional)
PARAMETERS FOR OPTIONS FILE
Program locations (NEXT-RNAi dependencies)
Set location of primer3_core script required for primer designs during the design and evaluation of long dsRNAs (default = /usr/bin/). Primer3 settings can be influenced in an additional options file (see PRIMER3OPT below).
Set location of bowtie script required by NEXT-RNAi (default = /usr/bin/). Bowtie is used for mappings to determine the specificity of an RNAi reagent (against the database defined with -d) and for mappings to determine the location of an RNAi reagent in the genome. The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents and requires the definition of a mapping database (GENOMEBOWTIE).
Set location of mdust program for the evaluation of low-complexity regions in the input sequences (default = disabled).
Set location of blat program for mapping RNAi reagents to the genome (default = disabled). By default BOWTIE is used for mappings. However, if reagents were designed on CDS (SOURCE=CDS) Blat is required to allow for gapped alignments to the genome. BLAT mapping can be influenced by a set of further options (see GENOMEFASTA, BLATALIGN, BLATSPLIT, BLATPROGRAM, BLATHOST, BLATPORT below). The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents.
Set location of blastall program, the FASTA database used to determine homology (needs prior formatting to a FASTA database with formatdb command included in Blast package) and the e-value cutoff for homology (e.g. 1e-10). These three input parameters are separated by comma (default = disabled).
Set location of RNAfold.pl script that belongs to the Vienna RNA package (default = /usr/bin/). This program is required only for efficiency predictions using the RATIONAL method (see EFFICIENCY below).
Set length [nt] of siRNAs used for off-target evaluation (default = 19).
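To make the role of the siRNA length concrete, here is a minimal Python sketch that enumerates all overlapping windows of a given length in a reagent sequence. It assumes that every overlapping window is treated as one candidate siRNA; the function name sirna_windows and the example sequence are hypothetical.

```python
def sirna_windows(sequence, length=19):
    """Return all overlapping windows of `length` nt from `sequence`."""
    sequence = sequence.upper()
    # A sequence shorter than `length` yields an empty list.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# A 22-nt toy sequence contains 22 - 19 + 1 = 4 windows of 19 nt.
windows = sirna_windows("ACGTACGTACGTACGTACGTAC", 19)
```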
Set minimal and maximal length of desired RNAi reagents (default = ’80,500’) separated by comma.
Set number of RNAi reagents to be designed for each identified specific region in the queried target sequences (default = 50).
Set number of RNAi reagents to be returned for each queried target sequence (default = 1).
Set location of file with options for PRIMER3 program in ’TAG=VALUE’ format (visit Primer3 documentation for help), default settings are used otherwise (default = disabled).
Set sequence to be added 5’ to both forward and reverse primer sequences (for the design of long dsRNAs), e.g. a T7- or SP6-tag for in vitro transcription (default = disabled).
Set efficiency calculation method and efficiency cutoff score separated by comma (e.g. ’EFFICIENCY=SIR,50’). Available calculation methods are ’RATIONAL’ for calculations according to Reynolds et al. (requires VIENNA software), or ’SIR’ according to Shah et al. (default = ’SIR’). The efficiency cutoff defines the minimal required efficiency for a siRNA to be selected (only for de novo designs, -e NO). Further documentation about efficiency prediction is available at http://www.nextrnai.org/.
If set to ’FULL’ NEXT-RNAi is forced to use the complete input target sequences as design template, otherwise only calculated specific regions are considered (default = CALC for de novo designs, default = FULL for evaluations).
Set location of file containing feature location information. A tab-delimited file with headers ’FeatureName’, ’FeatureLoc’ (location in GENOMEBOWTIE / GENOMEFASTA database), ’FeatureStart’ (start of feature in GENOMEBOWTIE / GENOMEFASTA) and ’FeatureEnd’ (end of feature in GENOMEBOWTIE / GENOMEFASTA database). Requires mapping of reagents to GENOMEBOWTIE or GENOMEFASTA databases (default = disabled).
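The four-column feature file described above could be read and intersected with a mapped reagent as in the following Python sketch. The function names, the example identifiers and the use of 1-based inclusive coordinates are assumptions for illustration; how NEXT-RNAi itself computes feature contents is not specified here.

```python
import csv
import io


def read_features(handle):
    """Read a tab-delimited feature file with the four documented headers."""
    return [
        (row["FeatureName"], row["FeatureLoc"],
         int(row["FeatureStart"]), int(row["FeatureEnd"]))
        for row in csv.DictReader(handle, delimiter="\t")
    ]


def overlapping_features(features, loc, start, end):
    """Return (name, overlap in nt) for features overlapping loc:start-end."""
    hits = []
    for name, f_loc, f_start, f_end in features:
        if f_loc != loc:
            continue
        # Assumption: coordinates are 1-based and inclusive.
        overlap = min(end, f_end) - max(start, f_start) + 1
        if overlap > 0:
            hits.append((name, overlap))
    return hits


example = (
    "FeatureName\tFeatureLoc\tFeatureStart\tFeatureEnd\n"
    "exon1\t2L\t100\t200\n"
    "exon2\t2L\t300\t400\n"
    "exonX\tX\t100\t200\n"
)
features = read_features(io.StringIO(example))
hits = overlapping_features(features, "2L", 150, 320)
```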
Option for calculation of CA[ACGT] tandem trinucleotide repeats in target or reagent sequences. This option is enabled by setting the minimal number of CAN repeats (e.g. 6) to be detected (default = disabled).
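The kind of repeat meant here can be sketched with a regular expression. The following Python example finds runs of at least a given number of consecutive CA[ACGT] trinucleotides; the function name find_can_repeats and the test sequence are hypothetical, and the sketch is not taken from the NEXT-RNAi source.

```python
import re


def find_can_repeats(sequence, min_repeats=6):
    """Return (start, end, match) for runs of >= min_repeats CA[ACGT] units.

    Positions are 0-based, with `end` exclusive (Python slice convention).
    """
    pattern = re.compile(r"(?:CA[ACGT]){%d,}" % min_repeats)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(sequence.upper())]


# Six tandem CAN units (CAG CAA CAG CAG CAA CAG) flanked by other bases.
hits = find_can_repeats("GG" + "CAGCAACAGCAGCAACAG" + "TT", 6)
```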
Calculation of seed matches from siRNA sense strand (starting at position 2) to a defined FASTA file OR a Bowtie database/index file (if a FASTA file was provided, NEXT-RNAi expects the bowtie-build script for building the Bowtie index in the BOWTIE folder). This option requires setting the length of the seed region (between 6 and 8), the maximal seed complement frequency allowed (for filtering of target sequences) and the location of the FASTA file or Bowtie database/index (prebuilt with bowtie-build), separated by comma (default = disabled).
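A rough Python sketch of the seed idea follows: it extracts the seed starting at position 2 of a given strand, builds its reverse complement, and counts how many sequences in a list contain it. Counting one hit per sequence and searching plain strings instead of a Bowtie index are simplifying assumptions; all function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def seed(sirna, seed_length=7):
    """Return the seed: `seed_length` nt starting at position 2 (1-based)."""
    if not 6 <= seed_length <= 8:
        raise ValueError("seed length must be between 6 and 8")
    return sirna.upper().replace("U", "T")[1:1 + seed_length]


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def seed_complement_frequency(sirna, targets, seed_length=7):
    """Count the target sequences that contain the seed complement."""
    site = reverse_complement(seed(sirna, seed_length))
    return sum(1 for t in targets if site in t.upper().replace("U", "T"))


sirna = "GACGTTAGCCATGGATCCA"
targets = ["TTTCTAACGTAAA", "GGGGGGGGGGGG"]
frequency = seed_complement_frequency(sirna, targets, 7)
```

A design filter could then compare this count with the maximal seed complement frequency given in the option value.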
Calculation of (e.g. miRNA-) seeds within a long dsRNA or siRNA from a given FASTA file containing miRNA sequences. Requires length of seed region (between 6 and 8, starting from position 2 in miRNA sense sequences) and location of FASTA file (separated by comma), siRNAs containing seeds will be excluded from designs (default = disabled).
Results for siRNA evaluations can be summarized for pools of sequences. This option requires the location of a tab-delimited file containing the headers ’siRNAID’ and ’POOLID’ to define connections between query siRNA identifiers and corresponding siRNA-pool identifiers. This option is only available for the evaluation of siRNAs (default = disabled).
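The pool file layout can be illustrated with a short Python sketch that groups siRNA identifiers by pool. Only the two header names come from the description above; the function name read_pools and the identifiers in the example are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_pools(handle):
    """Map each POOLID to the list of siRNAIDs assigned to it."""
    pools = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        pools[row["POOLID"]].append(row["siRNAID"])
    return dict(pools)


example = (
    "siRNAID\tPOOLID\n"
    "si_001\tpool_A\n"
    "si_002\tpool_A\n"
    "si_003\tpool_B\n"
)
pools = read_pools(io.StringIO(example))
```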
Set location of FASTA file or Bowtie database/index containing sequences that should be avoided for independent reagent designs (file is appended to the off-target database) (default = disabled). In case a FASTA file was provided the ’bowtie-build’ script is required in the location defined for BOWTIE (to build the Bowtie index).
Percentage nt of a long dsRNA allowed to target intronic regions (default = 25).
Long dsRNA designs are by default ranked for percent specificity in first place and number of contained siRNAs predicted to be efficient in second place. NEXT-RNAi can be forced to rank designs for the absolute number of specific siRNAs contained in the long dsRNAs in second place (RANKD = SPEC), which maximizes the length of long dsRNA designs (default = EFF for efficiency ranking in second place).
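The two ranking modes amount to a two-key sort, which the following Python sketch illustrates. The Design record, its field names and the example numbers are hypothetical; only the order of the criteria follows the description above.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    percent_specific: float  # first ranking criterion
    efficient_sirnas: int    # second criterion for RANKD = EFF
    specific_sirnas: int     # second criterion for RANKD = SPEC


SECONDARY = {
    "EFF": lambda d: d.efficient_sirnas,
    "SPEC": lambda d: d.specific_sirnas,
}


def rank_designs(designs, rankd="EFF"):
    """Sort designs best-first by specificity, then by the chosen criterion."""
    if rankd not in SECONDARY:
        raise ValueError("RANKD must be 'EFF' or 'SPEC'")
    second = SECONDARY[rankd]
    return sorted(designs,
                  key=lambda d: (d.percent_specific, second(d)),
                  reverse=True)


designs = [
    Design("A", 100.0, 12, 100),
    Design("B", 100.0, 9, 300),
    Design("C", 95.0, 30, 400),
]
by_eff = [d.name for d in rank_designs(designs, "EFF")]
by_spec = [d.name for d in rank_designs(designs, "SPEC")]
```

In this toy data set the longer design B overtakes A only under SPEC, since it contains more specific siRNAs but fewer efficient ones.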
Define whether NEXT-RNAi is allowed to enter a (re-)design method (REDESIGN = ON) to enable the design of RNAi reagents for input sequences that do not meet the user-defined quality measures (specificity (SIRNALENGTH), EFFICIENCY, LOWCOMPEVAL, CANEVAL, SEEDMATCH and MIRSEED) (default = OFF).
For evaluation of designed RNAi reagents for ’off-target’ effects in additional databases. This option requires the location of a Bowtie database/index; the siRNA length [nt] for mappings; and whether off-target effects should be evaluated by positional information (’pos’, database has to be the same as in GENOMEBOWTIE / GENOMEFASTA) or by target information (’target’ uses targetgroups defined in TARGETGROUPS). Database, siRNA length and evaluation option are separated by comma. Multiple evaluations can be queried (default = disabled).
Mapping reagents to the ’off-target’ (-d) database
Location of file defining which sequences in the database file (-d option) belong to one group (e.g. splice variants of a gene) (default = disabled). A tab-delimited file containing the headers ’Target’ (e.g. transcripts) and ’TargetGroup’ (e.g. the gene the transcript belongs to) is required. NEXT-RNAi will then consider e.g. siRNAs that target multiple transcripts of the same gene as specific for this gene. Multiple files containing targetgroups can be defined in the options file.
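The grouping logic can be sketched in Python as follows: hits on several transcripts count as specific when all of them map to the same target group. Treating a target that is missing from the file as its own group is an assumption, and the function names and identifiers are hypothetical.

```python
import csv
import io


def read_target_groups(handle):
    """Map each 'Target' identifier to its 'TargetGroup'."""
    return {row["Target"]: row["TargetGroup"]
            for row in csv.DictReader(handle, delimiter="\t")}


def is_specific(hit_targets, target_groups):
    """True if all hit targets belong to exactly one target group."""
    # Assumption: a target without an entry forms a group of its own.
    groups = {target_groups.get(t, t) for t in hit_targets}
    return len(groups) == 1


example = (
    "Target\tTargetGroup\n"
    "geneA-RA\tgeneA\n"
    "geneA-RB\tgeneA\n"
    "geneB-RA\tgeneB\n"
)
groups = read_target_groups(io.StringIO(example))
same_gene = is_specific(["geneA-RA", "geneA-RB"], groups)
two_genes = is_specific(["geneA-RA", "geneB-RA"], groups)
```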
Location of file containing identifiers from the off-target database (-d option) that should be excluded as target sites, but not considered as real off-targets in case they were hit (e.g. UTR regions). A text file with the header ’Exclude’ listing identifiers to be excluded is required. Multiple ’EXCLUDED’ files can be queried (default = disabled).
Location of file containing sequence identifiers from the input file connected to their intended target (same as ’TargetGroup’ identifier in TARGETGROUPS file) that forces NEXT-RNAi always to output this gene as the primary, intended target of the reagent. A tab-delimited file with the headers ’Query’ and ’Intended’ listing the identifiers is required. Multiple ’INTENDED’ files can be queried (default = disabled).
Mapping reagents to the genome using Bowtie
Set location of mapping database/index for Bowtie. Bowtie needs mapping databases (indices) that were built with the bowtie-build script from FASTA files. The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents.
Mapping reagents to the genome using blat or gfClient
Set type of source from which target sequences were retrieved (’GENOMIC’ for genomic (unspliced) sources, ’CDS’ for spliced sources). It affects the type of mapping: for ’CDS’ sources BLAT is required, for ’GENOMIC’ sources BOWTIE is used (default = GENOMIC).
Set either to ’blat’ for local Blat alignments or to ’gfClient’ for alignments using a running Blat server (default = blat). The ’blat’ option requires setting of a FASTA database with the GENOMEFASTA option, the ’gfClient’ option requires BLATHOST and BLATPORT settings to connect to the Blat server.
Set location of FASTA mapping database for Blat. The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents.
Name of server that runs the Blat server (gfServer), required to run Blat mappings using the BLATPROGRAM gfClient.
Port to connect to a particular instance (database) of the Blat server defined in BLATHOST. Required to run Blat mappings using the BLATPROGRAM gfClient.
Split parameter for a large FASTA database defined in GENOMEFASTA (using blat as BLATPROGRAM). The FASTA database will be split into parts that each contain at most the defined number of sequences (default = 0, meaning no splitting).
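What such a split does can be sketched in a few lines of Python. The example works on FASTA text in memory and returns the parts as strings; the function names are hypothetical, and how NEXT-RNAi names or stores the parts is not covered.

```python
def read_records(text):
    """Split FASTA text into records, each a header line plus sequence lines."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append([line])
        elif records and line.strip():
            records[-1].append(line)
    return ["\n".join(r) + "\n" for r in records]


def split_fasta(text, per_part=0):
    """Return FASTA parts with at most `per_part` sequences each.

    A value of 0 keeps everything in a single part (no splitting).
    """
    records = read_records(text)
    if per_part <= 0:
        return ["".join(records)]
    return ["".join(records[i:i + per_part])
            for i in range(0, len(records), per_part)]


fasta = ">a\nACGT\n>b\nGGCC\n>c\nTTAA\n"
parts = split_fasta(fasta, 2)
```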
If set to ’PERFECT’, NEXT-RNAi only allows perfect matches during mapping of sequences to the genome with blat or gfClient. If set to ’PARTIAL’, partial mappings are also evaluated (default = PERFECT).
Set location of off-target database (as used for -d option) in FASTA format. This option is required for gapped alignments using Blat (e.g. to map siRNAs spanning exon-exon boundaries). NEXT-RNAi can use the mapping information from the off-target database to extend the reagent’s sequence and re-map it to the genome (default = disabled).
Set output folder for files created by NEXT-RNAi (default = location of the input file).
Enables output of general feature format (GFF) file by choosing either ’GFF2’ or ’GFF3’ format and requires prior mapping of reagents (see BOWTIE and BLAT options) (default = disabled).
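For orientation, a GFF3 record has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes). The Python sketch below formats one mapped reagent in that layout. The source and type values, the reagent identifier and the coordinates are hypothetical, and the actual records written by NEXT-RNAi may differ.

```python
def gff3_line(seqid, start, end, strand, reagent_id,
              source="NEXT-RNAi", feature_type="region"):
    """Format one mapped reagent as a nine-column GFF3 line."""
    columns = [
        seqid,
        source,        # hypothetical value for the 'source' column
        feature_type,  # hypothetical value for the 'type' column
        str(start),
        str(end),
        ".",           # score: not set
        strand,
        ".",           # phase: only meaningful for CDS features
        f"ID={reagent_id}",
    ]
    return "\t".join(columns)


line = gff3_line("2L", 1500, 1899, "+", "Probe_1")
print("##gff-version 3")
print(line)
```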
Set URL to a generic genome browser (GBrowse) instance for visualization of designed reagents in their genomic context (default = disabled). The URL needs to be a link to the ’gbrowse_img’ script of the GBrowse instance, e.g. for accessing our Drosophila melanogaster genome browser use http://www.dkfz.de/signaling/cgi-bin/gbrowse_img/flybase/. The visualization requires prior mapping of reagents (see BOWTIE and BLAT options) and further tracks can be added by setting the GBROWSETRACK option.
Set generic genome browser (GBrowse) tracks to be visualized with the designed RNAi reagents (default = disabled). Multiple tracks can be enabled by ’+’ concatenation (e.g. ’GENE+TXN’ for showing genes and transcripts in our Drosophila melanogaster GBrowse, see GBROWSEBASE option). The visualization requires prior mapping of reagents (see BOWTIE and BLAT options) and setting of the GBROWSEBASE URL.
Set to ’YES’ for generation of an annotations file that allows for the direct upload of design results to GBrowse (default = disabled). This requires prior mapping of reagents (see BOWTIE and BLAT options).
Thomas Horn (firstname.lastname@example.org) and Michael Boutros (email@example.com)