fasta-utils - Fasta Utilities
Overview
New in version 0.3.0.
A script with a set of commands that help use FASTA files with the framework.
split command
Used to split a FASTA file into smaller fragments.
translate command
Used to translate nucleotide sequences into amino acids.
uid command
Used to change the headers of a FASTA file to unique IDs. A tab-separated table recording the changes can be kept using the --table option.
filter
Used to filter a FASTA file by sequence length, or by whether the sequence or the header contains a given string. A list of headers to keep can be passed using the -f option.
info
Gets information about a FASTA file. Prints the seq_id (trimmed at the first space), length and hash (sha1 by default) of each sequence, and optionally the sequence and its GC content. The output can also be written in GFF format.
rename
Renames the headers of a FASTA file, adding a random suffix and an optional prefix.
Changes
New in version 0.3.0.
Changed in version 0.3.1: added translate and uid commands
Changed in version 0.3.4: ported to click
Changed in version 0.5.5: added to the translate command the options -1, to output only the forward/frame0 translation, and -w, to avoid wrapping at 60 chars
Changed in version 0.5.7: added filter and info commands for simple FASTA file filtering and info
Options
fasta-utils
Main function
fasta-utils [OPTIONS] COMMAND [ARGS]...
Options
--version  Show the version and exit.
--cite
filter
Filters a FASTA file [fasta-file]
fasta-utils filter [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
--len-gt <len_gt>  Keeps sequences whose length is greater than this value
--len-lt <len_lt>  Keeps sequences whose length is less than this value
--header-contains <header_contains>  Keeps sequences whose header contains the string
--seq-pattern <seq_pattern>  Keeps sequences that contain the string
-f, --header-file <header_file>  Keeps only the sequences whose headers are listed in the file
-w, --wrap  Wraps the output sequences to 60 characters
-s, --trim-tail  Removes header information after the first space
Arguments
FASTA_FILE  Optional argument
OUTPUT_FILE  Optional argument
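The filtering criteria above can be pictured with a short Python sketch. It is not the code behind fasta-utils: parse_fasta and keep_record are hypothetical helpers, and the sketch assumes that the length comparisons are strict and that every criterion passed must hold at the same time.

```python
from typing import Iterable, Iterator, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_record(
    header: str,
    seq: str,
    len_gt: Optional[int] = None,
    len_lt: Optional[int] = None,
    header_contains: Optional[str] = None,
    seq_pattern: Optional[str] = None,
) -> bool:
    """Return True if the record passes every criterion that was set.

    Assumption: criteria are combined with a logical AND.
    """
    if len_gt is not None and len(seq) <= len_gt:
        return False
    if len_lt is not None and len(seq) >= len_lt:
        return False
    if header_contains is not None and header_contains not in header:
        return False
    if seq_pattern is not None and seq_pattern not in seq:
        return False
    return True


fasta = """>seq1 sample A
ACGTACGT
ACGT
>seq2 sample B
ACG
""".splitlines()

# Keep only the headers of sequences longer than 5 characters
kept = [h for h, s in parse_fasta(fasta) if keep_record(h, s, len_gt=5)]
print(kept)
```

Only seq1 passes here, since its two sequence lines join into 12 characters while seq2 has 3.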
info
Gets information about a FASTA file [fasta-file]
fasta-utils info [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-h, --header  Prints header
-s, --include-seq  Includes the sequence
-r, --no-rename  Do not split sequence name at first space
-a, --hash-type <hash_type>
    Default: sha1
    Options: sha1|md5|sha256
-g, --out-gff  Outputs a GFF file
    Default: False
-gc, --gc-content  Includes the GC Content
    Default: False
Arguments
FASTA_FILE  Optional argument
OUTPUT_FILE  Optional argument
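As an illustration of the values this command reports, the sketch below computes them for a single record with the Python standard library. seq_info is a hypothetical helper, not part of fasta-utils, and it assumes that the hash is taken over the sequence string and that GC content is expressed as a fraction of the sequence length.

```python
import hashlib


def seq_info(header: str, seq: str, hash_type: str = "sha1") -> dict:
    """Return seq_id, length, hash and GC content for one FASTA record."""
    # seq_id is the header trimmed at the first space
    seq_id = header.split(" ", 1)[0]
    # Assumption: the digest is computed on the sequence text
    digest = hashlib.new(hash_type, seq.encode("ascii")).hexdigest()
    upper = seq.upper()
    gc_content = (upper.count("G") + upper.count("C")) / len(seq) if seq else 0.0
    return {
        "seq_id": seq_id,
        "length": len(seq),
        "hash": digest,
        "gc_content": gc_content,
    }


info = seq_info("seq1 sample A", "ACGT")
print(info["seq_id"], info["length"], info["hash"], info["gc_content"])
```

Passing hash_type="md5" or hash_type="sha256" selects the other two algorithms listed for --hash-type.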
rename
Renames the sequence headers of a FASTA file [fasta-file]. Adds up to 2 elements to the sequence header, joined by a separator character: 1) a suffix (a random string of characters) and 2) an optional prefix.
The separator should be ‘|’ (default), ‘#’ or another character at which other software does not truncate the header (a space is one at which it does).
In fact, this script itself truncates the header at the first space.
fasta-utils rename [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-p, --prefix <prefix>  Adds a prefix to the header
-f, --file-name  Adds filename as prefix (Useful for adding the file name
-s, --separator <separator>  Separator for the elements of the new header
-l, --suffix-len <suffix_len>  Number of random characters to use
Arguments
FASTA_FILE  Optional argument
OUTPUT_FILE  Optional argument
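The renaming scheme described above could look roughly like the following Python sketch. rename_header is a hypothetical function, and several details are assumptions rather than documented behaviour: the order of the elements (prefix, name, suffix), the alphabet used for the random suffix, and the default suffix length of 8.

```python
import random
import string


def rename_header(header, prefix=None, separator="|", suffix_len=8, rng=random):
    """Build a new header: [prefix], truncated name and random suffix."""
    # The header is truncated at the first space
    name = header.split(" ", 1)[0]
    # Assumption: the suffix is drawn from ASCII letters and digits
    alphabet = string.ascii_letters + string.digits
    suffix = "".join(rng.choices(alphabet, k=suffix_len))
    parts = [name, suffix]
    if prefix:
        parts.insert(0, prefix)
    return separator.join(parts)


# A seeded generator makes the example reproducible
new_header = rename_header(
    "seq1 sample A", prefix="fileA", suffix_len=6, rng=random.Random(0)
)
print(new_header)
```

The result contains no space, so tools that truncate headers at the first space keep all three elements.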
split
Splits a FASTA file [fasta-file] into a number of fragments
fasta-utils split [OPTIONS] [FASTA_FILE]
Options
-v, --verbose
-p, --prefix <prefix>  Prefix for the file name in output
    Default: split
-n, --number <number>  Number of chunks into which the FASTA file is split
    Default: 10
-z, --gzip  gzip output files
Arguments
FASTA_FILE  Optional argument
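A minimal Python sketch of splitting records into a fixed number of optionally gzipped files follows. It is not taken from fasta-utils: split_records and write_chunks are hypothetical, the round-robin distribution is an assumption, and so is the output file naming pattern.

```python
import gzip
import os
import tempfile


def split_records(records, number=10):
    """Distribute (header, sequence) pairs over `number` chunks."""
    chunks = [[] for _ in range(number)]
    for index, record in enumerate(records):
        # Assumption: records are assigned to chunks in round-robin order
        chunks[index % number].append(record)
    return chunks


def write_chunks(chunks, out_dir, prefix="split", use_gzip=False):
    """Write each chunk to its own FASTA file and return the paths."""
    paths = []
    for index, chunk in enumerate(chunks, start=1):
        # Hypothetical naming pattern, chosen only for this example
        name = f"{prefix}-{index:05d}.fa" + (".gz" if use_gzip else "")
        path = os.path.join(out_dir, name)
        opener = gzip.open if use_gzip else open
        with opener(path, "wt") as handle:
            for header, seq in chunk:
                handle.write(f">{header}\n{seq}\n")
        paths.append(path)
    return paths


records = [(f"seq{i}", "ACGT") for i in range(5)]
chunks = split_records(records, number=2)
with tempfile.TemporaryDirectory() as out_dir:
    paths = write_chunks(chunks, out_dir, use_gzip=True)
    # Read the first file back to check its content
    with gzip.open(paths[0], "rt") as handle:
        first_chunk_text = handle.read()
print(first_chunk_text)
```

Round-robin assignment keeps the chunks within one record of each other in size without needing to know the number of sequences in advance.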
translate
Translate FASTA file [fasta-file] in all 6 frames to [output-file]
fasta-utils translate [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-t, --trans-table <trans_table>  Translation table
    Default: universal
    Options: bac_plt|drs_mit|inv_mit|prt_mit|universal|vt_mit|yst_alt|yst_mit
-1, --one-seq  Only translates the sequence as given (forward strand, frame 0), instead of all 6 frames
    Default: False
-w, --no-wrap  Makes each sequence use only 1 line (2 including the header)
    Default: False
--progress  Shows Progress Bar
Arguments
FASTA_FILE  Optional argument
OUTPUT_FILE  Optional argument
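To make the idea of a six-frame translation concrete, here is a self-contained Python sketch that uses the standard (universal) genetic code. It does not reproduce the fasta-utils implementation: the function names and the frame labels (f0 to f2, r0 to r2) are hypothetical, and incomplete or ambiguous codons are assumed to be skipped or rendered as X.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq, start=0):
    """Translate seq from offset `start`, ignoring a trailing partial codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    # Assumption: codons with ambiguous bases are rendered as X
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(seq, one_seq=False):
    """Return a dict of frame label -> protein sequence."""
    if one_seq:
        return {"f0": translate(seq)}
    rev = reverse_complement(seq)
    frames = {f"f{i}": translate(seq, i) for i in range(3)}
    frames.update({f"r{i}": translate(rev, i) for i in range(3)})
    return frames


def wrap(seq, width=60):
    """Break a sequence into lines of at most `width` characters."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


frames = six_frames("ATGGCCTAA")
for label, protein in frames.items():
    print(label, wrap(protein))
```

With one_seq=True only the forward frame 0 is returned, which mirrors the purpose of the -1 option; skipping the call to wrap corresponds to -w.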
uid
Changes each header of a FASTA file [fasta-file] to a uid (unique ID)
fasta-utils uid [OPTIONS] [FASTA_FILE] [OUTPUT_FILE]
Options
-v, --verbose
-t, --table <table>  Filename of a table to record the changes (by default the table is discarded)
Arguments
FASTA_FILE  Optional argument
OUTPUT_FILE  Optional argument
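The sketch below shows one way to replace headers with unique IDs while keeping a tab-separated record of the changes. assign_uids is a hypothetical function; the use of uuid4 and the column order of the table (new ID first, original header second) are assumptions, not documented behaviour of the command.

```python
import uuid


def assign_uids(records):
    """Replace each header with a unique ID and record the mapping."""
    renamed, table = [], []
    for header, seq in records:
        # Assumption: a random UUID is used as the unique ID
        uid = str(uuid.uuid4())
        renamed.append((uid, seq))
        # Assumption: table columns are new ID, then original header
        table.append(f"{uid}\t{header}")
    return renamed, table


records = [("seq1 sample A", "ACGT"), ("seq1 sample A", "GGCC")]
renamed, table = assign_uids(records)
for line in table:
    print(line)
```

Even identical input headers receive distinct IDs, and the table makes it possible to map the new IDs back to the original headers.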