mgkit.filter.gff module
GFF filtering
-
mgkit.filter.gff.choose_annotation(ann1, ann2, overlap=100, choose_func=None)
New in version 0.1.12.
Given two mgkit.io.gff.Annotation instances, if one of the two annotations is contained in the other, or they overlap for at least overlap bases, choose_func is applied to both. The result of choose_func is the annotation to be discarded; it returns None if both annotations should be kept.
- No checks are made to ensure that the two annotations are on the same sequence and strand, as the intersect method of mgkit.io.gff.Annotation takes care of them.
- Parameters
ann1 – instance of mgkit.io.gff.Annotation
ann2 – instance of mgkit.io.gff.Annotation
overlap (int, float) – number of overlapping bases that triggers the filtering
choose_func (None, func) – function that accepts ann1 and ann2 and returns the one to be discarded, or None if both are accepted
- Returns
either the mgkit.io.gff.Annotation to be discarded or None, which is the result of choose_func
- Return type
(None, Annotation)
Note
If choose_func is None, the default function is used:
lambda a1, a2: min(a1, a2, key=lambda el: (el.dbq, el.bitscore, len(el)))
The key compares, in order of importance, the db quality (dbq), the bitscore and the length. The annotation with the lowest tuple value is the one to discard.
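As an illustration of how the default key orders two annotations, here is a minimal sketch. `FakeAnnotation` is a hypothetical stand-in, not the real mgkit.io.gff.Annotation: it carries only the `dbq` and `bitscore` fields used by the key, plus `start` and `end` to derive a length. The `default_choose` name is also an assumption; only the lambda expression comes from the note above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAnnotation:
    # Hypothetical stand-in for mgkit.io.gff.Annotation
    name: str
    dbq: int
    bitscore: float
    start: int
    end: int

    def __len__(self):
        # Assumes 1-based, inclusive coordinates, as in GFF
        return self.end - self.start + 1


def default_choose(a1, a2):
    # Same expression as the documented default: the smaller tuple loses
    return min(a1, a2, key=lambda el: (el.dbq, el.bitscore, len(el)))


good = FakeAnnotation("good", dbq=10, bitscore=50.0, start=1, end=300)
poor = FakeAnnotation("poor", dbq=5, bitscore=900.0, start=1, end=900)
tie_short = FakeAnnotation("tie_short", dbq=10, bitscore=50.0, start=1, end=100)

# dbq is compared first, so a high bitscore does not rescue a low dbq
discarded = default_choose(good, poor)
# With equal dbq and bitscore, the shorter annotation is discarded
discarded_tie = default_choose(good, tie_short)
```

Because tuples compare element by element, the bitscore only matters when the dbq values are equal, and the length only matters when both dbq and bitscore are equal.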
-
mgkit.filter.gff.filter_annotations(annotations, choose_func=None, sort_func=None, reverse=True)
New in version 0.1.12.
Filters an iterable of mgkit.io.gff.Annotation instances. The annotations are sorted using sort_func as the key in sorted, with reverse controlling whether the order is reversed; choose_func is then applied to all possible pair combinations, using itertools.combinations.
By default choose_func is choose_annotation() with the default values, and the list of annotations is sorted by bitscore, from the highest to the lowest value.
- Parameters
annotations (iterable) – iterable of mgkit.io.gff.Annotation instances
choose_func (func, None) – function used to select the losing annotation; if None, it will be choose_annotation() with default values
sort_func (func, None) – by default the sorting key is the bitscore of the annotations
reverse (bool) – passed to sorted; by default the order is reversed
- Returns
a set with the annotations that pass the filtering
- Return type
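The sort-then-compare-pairs procedure described for filter_annotations can be sketched as follows. This is not the library's implementation: `FakeAnnotation`, `overlap_size`, `choose_lower_bitscore` and `filter_pairs` are hypothetical names, and skipping pairs that involve an already discarded annotation is an assumption of this sketch.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAnnotation:
    # Hypothetical stand-in for mgkit.io.gff.Annotation
    name: str
    bitscore: float
    start: int
    end: int


def overlap_size(a1, a2):
    # Shared bases with inclusive coordinates; 0 when they do not overlap
    return max(0, min(a1.end, a2.end) - max(a1.start, a2.start) + 1)


def choose_lower_bitscore(a1, a2, overlap=100):
    # Hypothetical choose_func: returns the annotation to discard, or None
    if overlap_size(a1, a2) < overlap:
        return None
    return min(a1, a2, key=lambda el: el.bitscore)


def filter_pairs(annotations, choose_func, sort_func=None, reverse=True):
    if sort_func is None:
        sort_func = lambda el: el.bitscore
    ordered = sorted(annotations, key=sort_func, reverse=reverse)
    removed = set()
    for a1, a2 in itertools.combinations(ordered, 2):
        # Assumption: discarded annotations are not compared again
        if a1 in removed or a2 in removed:
            continue
        loser = choose_func(a1, a2)
        if loser is not None:
            removed.add(loser)
    return set(ordered) - removed


anns = [
    FakeAnnotation("a", 100.0, 1, 300),
    FakeAnnotation("b", 80.0, 50, 320),     # overlaps "a" by 251 bases
    FakeAnnotation("c", 60.0, 1000, 1200),  # overlaps nothing
]
kept = filter_pairs(anns, choose_lower_bitscore)
```

Sorting from the highest bitscore down means the strongest annotations are compared first, so a weak annotation is removed before it can displace anything else.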
-
mgkit.filter.gff.filter_attr_num(annotation, attr=None, value=None, greater=True)
Checks if an annotation attr dictionary contains a key whose value is greater than or equal to, or lower than or equal to, the requested value
- Parameters
annotation – mgkit.io.gff.Annotation instance
attr (str) – key in the mgkit.io.gff.Annotation.attr dictionary
value (int) – the value to compare against
greater (bool) – if True the value must be greater than or equal to the requested value; if False, lower than or equal to it
- Returns
True if the test passes
- Return type
-
mgkit.filter.gff.filter_attr_num_s(annotation, attr=None, value=None, greater=True)
New in version 0.3.1.
Checks if an annotation attr dictionary contains a key whose value is greater or lower than the requested value
- Parameters
annotation – mgkit.io.gff.Annotation instance
attr (str) – key in the mgkit.io.gff.Annotation.attr dictionary
value (int) – the value to compare against
greater (bool) – if True the value must be strictly greater than the requested value; if False, strictly lower than it
- Returns
True if the test passes
- Return type
-
mgkit.filter.gff.filter_attr_str(annotation, attr=None, value=None, equal=True)
Checks if an annotation attr dictionary contains a key whose value is equal to, or contains, the requested value
- Parameters
annotation – mgkit.io.gff.Annotation instance
attr (str) – key in the mgkit.io.gff.Annotation.attr dictionary
value (int) – the value to compare against
equal (bool) – if True the value must be equal to the requested value; if False the requested value must be contained in it
- Returns
True if the test passes
- Return type
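The two modes of the string test, exact match and containment, can be sketched as below. `attr_str_check` is a hypothetical helper operating on a plain dict in place of the Annotation.attr dictionary; the `gene_id` key and its value are made up for the example, and returning False for a missing key is an assumption.

```python
def attr_str_check(attr_dict, key, value, equal=True):
    # Hypothetical helper, not part of mgkit
    stored = attr_dict.get(key)
    if stored is None:
        # Assumption: a missing key fails the test
        return False
    if equal:
        return stored == value
    # Containment: the requested value must appear inside the stored one
    return value in stored


attr = {"gene_id": "K00001"}

exact = attr_str_check(attr, "gene_id", "K00001")
partial_as_exact = attr_str_check(attr, "gene_id", "K000")
partial_contained = attr_str_check(attr, "gene_id", "K000", equal=False)
```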
-
mgkit.filter.gff.filter_base(annotation, attr=None, value=None)
Checks if an annotation attribute is equal to the requested value
- Parameters
annotation – mgkit.io.gff.Annotation instance
attr (str) – attribute of the annotation
value – the value that the attribute should be equal to
- Returns
True if the supplied value is equal to the attribute, or False otherwise
- Return type
-
mgkit.filter.gff.filter_base_num(annotation, attr=None, value=None, greater=True)
Checks if an annotation attribute is greater than or equal to, or lower than or equal to, the requested value
- Parameters
annotation – mgkit.io.gff.Annotation instance
attr (str) – attribute of the annotation
value (int) – the value to which the attribute should be compared
greater (bool) – if True the attribute value must be greater than or equal to the requested value; if False, lower than or equal to it
- Returns
True if the test passes
- Return type
-
mgkit.filter.gff.filter_len(annotation, value=None, greater=True)
Checks if an annotation length is greater than or equal to, or lower than or equal to, the requested value
- Parameters
annotation – mgkit.io.gff.Annotation instance
value (int) – the length to which the annotation length should be compared
greater (bool) – if True the annotation length must be greater than or equal to the requested value; if False, lower than or equal to it
- Returns
True if the test passes
- Return type
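Since filter_base, filter_base_num and filter_len all take the annotation as their first argument and return a boolean, one plausible way to use checks of this shape is to bind their keyword arguments with functools.partial and require every check to pass. The sketch below assumes that usage pattern; `FakeAnnotation`, `base_check`, `base_num_check` and `len_check` are hypothetical stand-ins written for this example, not the mgkit functions themselves.

```python
from dataclasses import dataclass
from functools import partial


@dataclass(frozen=True)
class FakeAnnotation:
    # Hypothetical stand-in for mgkit.io.gff.Annotation
    seq_id: str
    strand: str
    start: int
    end: int

    def __len__(self):
        # Assumes 1-based, inclusive coordinates
        return self.end - self.start + 1


def base_check(annotation, attr=None, value=None):
    # Equality test on an object attribute
    return getattr(annotation, attr) == value


def base_num_check(annotation, attr=None, value=None, greater=True):
    # Inclusive numeric test on an object attribute
    current = getattr(annotation, attr)
    return current >= value if greater else current <= value


def len_check(annotation, value=None, greater=True):
    # Inclusive test on the annotation length
    return len(annotation) >= value if greater else len(annotation) <= value


checks = [
    partial(base_check, attr="strand", value="+"),
    partial(base_num_check, attr="start", value=500, greater=False),
    partial(len_check, value=100),
]

annotations = [
    FakeAnnotation("contig1", "+", 1, 300),     # passes every check
    FakeAnnotation("contig1", "-", 1, 300),     # wrong strand
    FakeAnnotation("contig2", "+", 10, 50),     # shorter than 100 bases
    FakeAnnotation("contig2", "+", 900, 1200),  # starts after position 500
]

passed = [ann for ann in annotations if all(check(ann) for check in checks)]
```

Keeping each check as a single-argument callable makes it easy to add or remove criteria without touching the loop that applies them.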