mgkit.io.fasta module
Simple fasta parser and a few utility functions
-
mgkit.io.fasta.load_fasta(file_handle)
Changed in version 0.1.13: now returns uppercase sequences.
Loads a fasta file and returns a generator of tuples, in which the first element is the name of the sequence and the second is the sequence itself.
- Parameters
file_handle (str, file) – fasta file to open; a file name or a file handle is expected
- Yields
tuple – first element is the sequence name/header, the second element is the sequence
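The behaviour described above can be illustrated with a minimal sketch. This is not the MGKit implementation: `parse_fasta` is a hypothetical stand-in that only accepts an open handle (not a file name), and it assumes that sequences may span several lines and that the header is returned without the leading `>`.

```python
import io


def parse_fasta(handle):
    """Yield (name, sequence) tuples from an open fasta handle (hypothetical helper)."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                yield name, "".join(chunks).upper()
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    # Emit the last record in the file
    if name is not None:
        yield name, "".join(chunks).upper()


example = io.StringIO(">seq1 sample\nacgt\nTTaa\n>seq2\nggcc\n")
records = list(parse_fasta(example))
```

Because the function is a generator, records are produced one at a time and the whole file never has to be held in memory.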
-
mgkit.io.fasta.load_fasta_files(files)
New in version 0.3.4.
Loads all fasta files from a list or iterable
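As a rough illustration of reading several fasta sources in one pass, the sketch below chains a simple parser over an iterable of open handles. Both `iter_fasta` and `iter_fasta_many` are hypothetical names, and the assumption that records from all files come out as a single stream of (name, sequence) tuples is not confirmed by the description above.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) tuples from one open fasta handle (hypothetical helper)."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks).upper()
            name, chunks = line[1:], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks).upper()


def iter_fasta_many(handles):
    """Yield the records of every handle, in the order the handles are given."""
    for handle in handles:
        yield from iter_fasta(handle)


handles = [io.StringIO(">a\nacgt\n"), io.StringIO(">b\nttga\n>c\ncc\n")]
names = [name for name, seq in iter_fasta_many(handles)]
```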
-
mgkit.io.fasta.load_fasta_prodigal(file_handle)
New in version 0.3.1.
Reads a Prodigal amino acid fasta file and yields a dictionary with basic information about each sequence.
- Parameters
file_handle (str, file) – passed to load_fasta()
- Yields
dict – dictionary with the information contained in the header; the last of the attributes is put into the key attr, while the rest are transformed into other keys: seq_id, seq, start, end (genomic), strand, ordinal of
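To make the header handling more concrete, here is a minimal sketch of splitting a Prodigal-style header into a dictionary. It assumes the header fields are separated by ` # ` (sequence ID, start, end, strand, attributes) and that the strand is written as `1` or `-1`. The function name `parse_prodigal_header`, the `+`/`-` strand encoding, and the example header are illustrative assumptions, not MGKit's actual code or output, and only the keys named in full above are produced.

```python
def parse_prodigal_header(header, seq):
    """Split a Prodigal-style header into a dictionary (hypothetical helper)."""
    # Assumed layout: "<seq_id> # <start> # <end> # <strand> # <attributes>"
    seq_id, start, end, strand, attr = header.split(" # ", 4)
    return {
        "seq_id": seq_id,
        "seq": seq,
        "start": int(start),
        "end": int(end),
        # Assumption: map Prodigal's 1 / -1 to "+" / "-"
        "strand": "+" if strand == "1" else "-",
        # The trailing attribute string is kept unparsed under "attr"
        "attr": attr,
    }


header = "contig_1_1 # 2 # 181 # 1 # ID=1_1;partial=10;start_type=Edge"
info = parse_prodigal_header(header, "MKV")
```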
-
mgkit.io.fasta.load_fasta_rename(file_handle, name_func=None)
New in version 0.3.1.
Renames the header of each sequence using name_func, which is called on each header. By default, only the part of the header to the left of the first space is kept (BLAST behaviour).
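The renaming logic can be sketched as follows. `rename_records` is a hypothetical helper that works on an iterable of (header, sequence) tuples rather than on a file handle, and its default mirrors the documented behaviour of keeping the text before the first space.

```python
def rename_records(records, name_func=None):
    """Yield (new_name, sequence) tuples, applying name_func to each header."""
    if name_func is None:
        # Default: keep only the text to the left of the first space
        def name_func(header):
            return header.split(" ", 1)[0]
    for header, seq in records:
        yield name_func(header), seq


records = [("seq1 Escherichia coli", "ACGT"), ("seq2", "GGCC")]
default_names = list(rename_records(records))
upper_names = list(rename_records(records, name_func=str.upper))
```

Passing a custom callable, as in the second call, replaces the default rule entirely.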
-
mgkit.io.fasta.split_fasta_file(file_handle, name_mask, num_files)
New in version 0.1.13.
Splits a fasta file into a series of smaller files.
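One possible way to split records across several files is sketched below. The description above does not say how name_mask is interpreted or how sequences are distributed, so this sketch assumes that name_mask is a format string with a `{0}` placeholder for the file index and that records are assigned round-robin. `split_records` is a hypothetical helper that takes (name, sequence) tuples instead of a file handle.

```python
import os
import tempfile


def split_records(records, name_mask, num_files):
    """Write records round-robin into num_files fasta files (hypothetical helper)."""
    # Assumption: name_mask is a format string such as "part-{0}.fa"
    handles = [open(name_mask.format(i), "w") for i in range(num_files)]
    try:
        for index, (name, seq) in enumerate(records):
            handles[index % num_files].write(">{}\n{}\n".format(name, seq))
    finally:
        for handle in handles:
            handle.close()
    return [handle.name for handle in handles]


records = [("s1", "ACGT"), ("s2", "GGCC"), ("s3", "TTAA")]
out_dir = tempfile.mkdtemp()
paths = split_records(records, os.path.join(out_dir, "part-{0}.fa"), 2)

with open(paths[0]) as handle:
    first = handle.read()
with open(paths[1]) as handle:
    second = handle.read()
```

Round-robin assignment keeps the output files close to the same number of sequences without needing to count the records in advance.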