fasta-utils - Fasta Utilities
Overview
New in version 0.3.0.
Scripts that include functionality to help use FASTA files with the framework.
split command
Used to split a FASTA file into smaller fragments.
translate command
Used to translate nucleotide sequences into amino acids.
uid command
Used to change the headers of a FASTA file to unique IDs. A tab-separated table recording the changes can be kept using the --table option.
filter
Used to filter a FASTA file by sequence length, or by whether the sequence or the header contains a given pattern. A list of headers to keep can be passed using the -f option.
info
Gets information about a FASTA file. For each sequence it prints the seq_id (trimmed at the first space), the length and a hash (sha1 by default); optionally it also prints the sequence and the GC content, and it can write the output in GFF format.
rename
Renames the headers of a FASTA file, appending a random suffix and an optional prefix.
Changes
New in version 0.3.0.
Changed in version 0.3.1: added translate and uid commands
Changed in version 0.3.4: ported to click
Changed in version 0.5.5: added two options to the translate command: -1 to output only the forward/frame0 translation and -w to avoid wrapping at 60 characters
Changed in version 0.5.7: added filter and info commands for simple FASTA file filtering and information
Options
fasta-utils
Main function
fasta-utils [OPTIONS] COMMAND [ARGS]...
Options
--version
    Show the version and exit.
--cite
filter
Filters a FASTA file [fasta-file]
fasta-utils filter [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
--len-gt <len_gt>
    Keeps sequences whose length is greater than the given value
--len-lt <len_lt>
    Keeps sequences whose length is less than the given value
--header-contains <header_contains>
    Keeps sequences whose header contains the string
--seq-pattern <seq_pattern>
    Keeps sequences that contain the string
-f, --header-file <header_file>
    Keeps only sequences whose headers are listed in the file
-w, --wrap
    Wraps the output sequences to 60 characters
-s, --trim-tail
    Removes header information after the first space
Arguments
FASTA_FILE
    Optional argument
OUTPUT_FILE
    Optional argument
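The filtering criteria listed above can be illustrated with a small Python sketch. This is not the fasta-utils implementation: the names `parse_fasta` and `filter_records` are hypothetical, and combining the criteria with a logical AND is an assumption, since the option descriptions do not say how several filters interact.

```python
from typing import Iterable, Iterator, Optional, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_records(
    records: Iterable[Record],
    len_gt: Optional[int] = None,
    len_lt: Optional[int] = None,
    header_contains: Optional[str] = None,
    seq_pattern: Optional[str] = None,
) -> Iterator[Record]:
    """Keep records passing every criterion that is set (assumed AND logic)."""
    for header, seq in records:
        if len_gt is not None and not len(seq) > len_gt:
            continue
        if len_lt is not None and not len(seq) < len_lt:
            continue
        if header_contains is not None and header_contains not in header:
            continue
        if seq_pattern is not None and seq_pattern not in seq:
            continue
        yield header, seq
```

For example, `filter_records(parse_fasta(open("input.fa")), len_gt=100)` would keep only the sequences longer than 100 characters; the file name here is only a placeholder.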
info
Gets information about a FASTA file [fasta-file]
fasta-utils info [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-h, --header
    Prints header
-s, --include-seq
    Includes the sequence
-r, --no-rename
    Do not split sequence name at first space
-a, --hash-type <hash_type>
    Default: sha1
    Options: sha1|md5|sha256
-g, --out-gff
    Outputs a GFF file
    Default: False
-gc, --gc-content
    Includes the GC Content
    Default: False
Arguments
FASTA_FILE
    Optional argument
OUTPUT_FILE
    Optional argument
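As a rough illustration of the values the info command reports, the sketch below computes the seq_id, length, hash and GC content for one record. The function name `sequence_info` is hypothetical, and hashing the sequence exactly as given (without changing its case) is an assumption; the actual command may differ.

```python
import hashlib


def sequence_info(header: str, seq: str, hash_type: str = "sha1") -> dict:
    """Return basic information about one FASTA record."""
    # The seq_id is the header trimmed at the first space
    seq_id = header.split(" ", 1)[0]
    # hash_type is one of the names listed for --hash-type (sha1, md5, sha256)
    digest = hashlib.new(hash_type, seq.encode("ascii")).hexdigest()
    upper = seq.upper()
    # GC content as a fraction of the sequence length (0.0 for an empty sequence)
    gc = (upper.count("G") + upper.count("C")) / len(seq) if seq else 0.0
    return {"seq_id": seq_id, "length": len(seq), "hash": digest, "gc_content": gc}
```

Because the hash depends only on the sequence, two records with identical sequences and different headers get the same hash value, which is one way duplicate sequences can be spotted.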
rename
Renames the sequence headers of a FASTA file [fasta-file]. Adds up to two elements to the sequence header, joined by a separator character: 1) a suffix (a random string of characters) and 2) an optional prefix.
The separator should be ‘|’ (the default), ‘#’ or another character at which other software does not truncate the header (many tools truncate at a space).
In fact, this script itself truncates the header at the first space.
fasta-utils rename [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-p, --prefix <prefix>
    Adds a prefix to the header
-f, --file-name
    Adds filename as prefix (Useful for adding the file name
-s, --separator <separator>
    Separator for the elements of the new header
-l, --suffix-len <suffix_len>
    Number of random characters to use
Arguments
FASTA_FILE
    Optional argument
OUTPUT_FILE
    Optional argument
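The renaming scheme described above might look like the following sketch. It is only an illustration: the function name `rename_header`, the order of the elements (prefix, name, suffix), the default suffix length of 8 and the alphabet used for the random suffix are all assumptions that the documentation does not specify.

```python
import random
import string


def rename_header(header, prefix=None, separator="|", suffix_len=8, rng=random):
    """Build a new header from an optional prefix, the name and a random suffix."""
    # The header is truncated at the first space, as the command description states
    name = header.split(" ", 1)[0]
    # Assumed alphabet for the random suffix: lowercase letters and digits
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(alphabet) for _ in range(suffix_len))
    parts = [name, suffix] if prefix is None else [prefix, name, suffix]
    return separator.join(parts)
```

Passing a seeded `random.Random` instance as `rng` makes the suffixes reproducible, which is convenient when testing.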
split
Splits a FASTA file [fasta-file] into a number of fragments
fasta-utils split [OPTIONS] [FASTA_FILE]
Options
-v, --verbose
-p, --prefix <prefix>
    Prefix for the file name in output
    Default: split
-n, --number <number>
    Number of chunks into which to split the FASTA file
    Default: 10
-z, --gzip
    gzip output files
Arguments
FASTA_FILE
    Optional argument
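One simple way to divide records into a fixed number of chunks is round-robin assignment, sketched below. Whether the split command distributes sequences this way is an assumption, and `split_records` is a hypothetical name used only for illustration.

```python
def split_records(records, number=10):
    """Distribute records over `number` chunks in round-robin order."""
    chunks = [[] for _ in range(number)]
    for index, record in enumerate(records):
        # Record 0 goes to chunk 0, record 1 to chunk 1, and so on, wrapping around
        chunks[index % number].append(record)
    return chunks
```

Round-robin assignment keeps the chunk sizes within one record of each other without needing to know the total number of sequences in advance.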
translate
Translates a FASTA file [fasta-file] in all 6 frames to [output-file]
fasta-utils translate [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-t, --trans-table <trans_table>
    Translation table
    Default: universal
    Options: bac_plt|drs_mit|inv_mit|prt_mit|universal|vt_mit|yst_alt|yst_mit
-1, --one-seq
    Only translate the sequence as given (forward strand, frame 0), instead of all 6 frames
    Default: False
-w, --no-wrap
    Make a sequence use only 1 line (2 including header)
    Default: False
--progress
    Shows Progress Bar
Arguments
FASTA_FILE
    Optional argument
OUTPUT_FILE
    Optional argument
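Six-frame translation can be sketched as follows, using the standard ("universal") genetic code. This is an illustrative example and not the code behind the translate command: the function names are hypothetical, and rendering unknown codons as "X" and ordering the frames as three forward followed by three reverse-complement frames are assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq, frame=0):
    """Translate one frame; codons not in the table become 'X' (assumption)."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(seq):
    """Return frames 0-2 of the forward strand, then 0-2 of the reverse complement."""
    reverse = reverse_complement(seq)
    return [translate(seq, f) for f in range(3)] + [translate(reverse, f) for f in range(3)]
```

Calling `translate(seq)` alone corresponds to the forward/frame0 case that the -1 option is described as producing.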
uid
Changes each header of a FASTA file [fasta-file] to a uid (unique ID)
fasta-utils uid [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-t, --table <table>
    File name of a table in which to record the changes (discarded by default)
Arguments
FASTA_FILE
    Optional argument
OUTPUT_FILE
    Optional argument
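The idea of replacing headers with unique IDs while keeping a record of the changes can be sketched as below. The function name `assign_uids`, the use of `uuid.uuid4` and the column order of the table (new ID first, original header second) are assumptions made for illustration; the documentation only states that the table is tab separated.

```python
import uuid


def assign_uids(records):
    """Replace each header with a unique ID and record the mapping."""
    renamed, table = [], []
    for header, seq in records:
        uid = str(uuid.uuid4())
        renamed.append((uid, seq))
        # One tab-separated line per record: new ID, then the original header
        table.append(f"{uid}\t{header}")
    return renamed, table
```

Writing the `table` lines to a file preserves the link between the new IDs and the original headers, which would otherwise be lost.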