mgkit.filter.gff module

GFF filtering

mgkit.filter.gff.choose_annotation(ann1, ann2, overlap=100, choose_func=None)[source]

New in version 0.1.12.

Given two mgkit.io.gff.Annotation instances, if one of the two annotations is contained in the other, or they overlap by at least overlap bases, choose_func is applied to both. The result of choose_func is the annotation to be discarded. It returns None if both annotations should be kept.

No checks are made to ensure that the two annotations are on the same sequence and strand, as the intersect method of mgkit.io.gff.Annotation takes care of them.

Parameters
  • ann1 – instance of mgkit.io.gff.Annotation

  • ann2 – instance of mgkit.io.gff.Annotation

  • overlap (int, float) – number of overlapping bases that triggers the filtering

  • choose_func (None, func) – function that accepts ann1 and ann2 and returns the one to be discarded, or None if both are kept

Returns

either the mgkit.io.gff.Annotation to be discarded or None, which is the result of choose_func

Return type

(None, Annotation)

Note

If choose_func is None, the default function is used:

lambda a1, a2: min(a1, a2, key=lambda el: (el.dbq, el.bitscore,
                   len(el)))

The sort key considers, in order of importance, the db quality, the bitscore and the length. The annotation with the lowest tuple value is the one to discard.
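The tuple comparison used by the default function can be tried without mgkit installed. The following is a minimal sketch: FakeAnnotation is a hypothetical stand-in that models only the dbq, bitscore and length values, and it is not the real mgkit.io.gff.Annotation class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAnnotation:
    # Hypothetical stand-in for mgkit.io.gff.Annotation (assumption).
    name: str
    dbq: int
    bitscore: float
    length: int

    def __len__(self):
        return self.length


# Same expression as the documented default function.
default_choose = lambda a1, a2: min(
    a1, a2, key=lambda el: (el.dbq, el.bitscore, len(el))
)

a = FakeAnnotation("a", dbq=10, bitscore=50.0, length=300)
b = FakeAnnotation("b", dbq=10, bitscore=80.0, length=150)
c = FakeAnnotation("c", dbq=5, bitscore=200.0, length=900)

# Equal dbq, so the bitscore decides: "a" has the lower tuple.
discarded_ab = default_choose(a, b)
# dbq is compared first, so "c" loses despite its higher bitscore.
discarded_ac = default_choose(a, c)
```

Because tuples compare element by element, the bitscore only matters when the db quality is tied, and the length only matters when both are tied.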

mgkit.filter.gff.filter_annotations(annotations, choose_func=None, sort_func=None, reverse=True)[source]

New in version 0.1.12.

Filters an iterable of mgkit.io.gff.Annotation instances. The annotations are first sorted with sorted, using sort_func as the key and reverse to set the order; choose_func is then applied to all possible pairs, generated with itertools.combinations.

By default choose_func is choose_annotation() with its default values, and the list of annotations is sorted by bitscore, from the highest to the lowest value.

Parameters
  • annotations (iterable) – iterable of mgkit.io.gff.Annotation instances

  • choose_func (func, None) – function used to select the losing annotation; if None, it will be choose_annotation() with default values

  • sort_func (func, None) – function used as the sorting key; if None, the key is the bitscore of the annotations

  • reverse (bool) – passed to sorted; by default the order is reversed

Returns

a set with the annotations that pass the filtering

Return type

set
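The sort-then-compare-pairs procedure described for filter_annotations() might look roughly like the sketch below. It is not the library's source: FakeAnnotation, sketch_choose and sketch_filter are hypothetical names, the choice is simplified to the bitscore alone, and skipping pairs that involve an already discarded annotation is an assumption of this sketch.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAnnotation:
    # Hypothetical stand-in for mgkit.io.gff.Annotation,
    # with 1-based inclusive coordinates (assumption).
    name: str
    start: int
    end: int
    bitscore: float


def sketch_choose(ann1, ann2, overlap=100):
    # Return the annotation to discard, or None to keep both.
    contained = (
        (ann1.start >= ann2.start and ann1.end <= ann2.end)
        or (ann2.start >= ann1.start and ann2.end <= ann1.end)
    )
    shared = min(ann1.end, ann2.end) - max(ann1.start, ann2.start) + 1
    if not contained and shared < overlap:
        return None
    # Simplified rule: the lower bitscore loses.
    return min(ann1, ann2, key=lambda el: el.bitscore)


def sketch_filter(annotations, choose_func=sketch_choose, reverse=True):
    ordered = sorted(annotations, key=lambda el: el.bitscore, reverse=reverse)
    discarded = set()
    for ann1, ann2 in itertools.combinations(ordered, 2):
        # Assumption: an annotation that already lost is not compared again.
        if ann1 in discarded or ann2 in discarded:
            continue
        loser = choose_func(ann1, ann2)
        if loser is not None:
            discarded.add(loser)
    return set(ordered) - discarded


x = FakeAnnotation("x", 1, 300, 100.0)
y = FakeAnnotation("y", 150, 400, 60.0)   # overlaps x by 151 bases
z = FakeAnnotation("z", 500, 700, 40.0)   # overlaps nothing

kept_names = sorted(ann.name for ann in sketch_filter([x, y, z]))
```

Here y is dropped because it overlaps x by more than 100 bases and has the lower bitscore, while z survives because it overlaps neither of the others.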

mgkit.filter.gff.filter_attr_num(annotation, attr=None, value=None, greater=True)[source]

Checks if an annotation attr dictionary contains a key whose value is greater than or equal to, or lower than or equal to, the requested value

Parameters
Returns

True if the test passes

Return type

bool

mgkit.filter.gff.filter_attr_num_s(annotation, attr=None, value=None, greater=True)[source]

New in version 0.3.1.

Checks if an annotation attr dictionary contains a key whose value is greater or lower than the requested value

Parameters
Returns

True if the test passes

Return type

bool
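The difference between an inclusive and a strict numeric test on the attr dictionary can be illustrated as follows. This is only a sketch under several assumptions: sketch_attr_num and its strict flag are hypothetical, a missing key is taken to fail the test, the stored value is converted with float(), and reading filter_attr_num_s() as the strict variant is an inference from the two descriptions above.

```python
import operator
from types import SimpleNamespace


def sketch_attr_num(annotation, attr, value, greater=True, strict=False):
    # Assumption: a missing key fails the test.
    if attr not in annotation.attr:
        return False
    if strict:
        compare = operator.gt if greater else operator.lt
    else:
        compare = operator.ge if greater else operator.le
    # Assumption: values may be stored as strings, so convert first.
    return compare(float(annotation.attr[attr]), value)


# Hypothetical annotation exposing only an attr dictionary.
ann = SimpleNamespace(attr={"identity": "95.0"})

inclusive = sketch_attr_num(ann, "identity", 95.0)            # >= 95.0
strict = sketch_attr_num(ann, "identity", 95.0, strict=True)  # > 95.0
missing = sketch_attr_num(ann, "coverage", 1.0)
```

A value exactly at the threshold passes the inclusive test but not the strict one.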

mgkit.filter.gff.filter_attr_str(annotation, attr=None, value=None, equal=True)[source]

Checks if an annotation attr dictionary contains a key whose value is equal to, or contains, the requested value

Parameters
Returns

True if the test passes

Return type

bool
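A string test of this kind could be sketched as below. The function sketch_attr_str is hypothetical, and two points are assumptions rather than documented behaviour: that equal=False switches from equality to a substring test, and that a missing key fails the test.

```python
from types import SimpleNamespace


def sketch_attr_str(annotation, attr, value, equal=True):
    stored = annotation.attr.get(attr)
    # Assumption: a missing key fails the test.
    if stored is None:
        return False
    # equal=True: exact match; equal=False: substring match.
    return stored == value if equal else value in stored


# Hypothetical annotation exposing only an attr dictionary.
ann = SimpleNamespace(attr={"product": "DNA polymerase III"})

exact = sketch_attr_str(ann, "product", "polymerase")
contains = sketch_attr_str(ann, "product", "polymerase", equal=False)
```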

mgkit.filter.gff.filter_base(annotation, attr=None, value=None)[source]

Checks if an annotation attribute is equal to the requested value

Parameters
  • annotation – mgkit.io.gff.Annotation instance

  • attr (str) – attribute of the annotation

  • value – the value that the attribute should be equal to

Returns

True if the supplied value is equal to the attribute, or False otherwise

Return type

bool
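Since the filters take the annotation as the first argument and the test settings as keywords, they can be turned into one-argument predicates with functools.partial. The sketch below shows the idea with a hypothetical sketch_base function and SimpleNamespace objects in place of real annotations; the seq_id and strand attribute names are used for illustration only.

```python
from functools import partial
from types import SimpleNamespace


def sketch_base(annotation, attr=None, value=None):
    # Equality test on a plain attribute of the annotation.
    return getattr(annotation, attr) == value


# Hypothetical stand-ins for mgkit.io.gff.Annotation instances.
annotations = [
    SimpleNamespace(seq_id="contig1", strand="+"),
    SimpleNamespace(seq_id="contig2", strand="-"),
    SimpleNamespace(seq_id="contig3", strand="+"),
]

# Fix the test settings once, then reuse the predicate.
on_plus_strand = partial(sketch_base, attr="strand", value="+")
kept = [ann.seq_id for ann in annotations if on_plus_strand(ann)]
```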

mgkit.filter.gff.filter_base_num(annotation, attr=None, value=None, greater=True)[source]

Checks if an annotation attribute is greater than or equal to, or lower than or equal to, the requested value

Parameters
  • annotation – mgkit.io.gff.Annotation instance

  • attr (str) – attribute of the annotation

  • value (int) – the value to which the attribute should be compared

  • greater (bool) – if True the attribute value must be equal to or greater than the value; if False, equal to or lower than it

Returns

True if the test passes

Return type

bool
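The effect of the greater flag at the boundary can be seen in a small sketch. The function sketch_base_num is hypothetical and only mirrors the documented behaviour (both directions include equality); the annotation is a SimpleNamespace stand-in.

```python
import operator
from types import SimpleNamespace


def sketch_base_num(annotation, attr=None, value=None, greater=True):
    # greater=True -> attribute >= value; greater=False -> attribute <= value
    compare = operator.ge if greater else operator.le
    return compare(getattr(annotation, attr), value)


# Hypothetical stand-in for an annotation with a bitscore attribute.
ann = SimpleNamespace(bitscore=60.0)

at_least_60 = sketch_base_num(ann, attr="bitscore", value=60)
at_most_60 = sketch_base_num(ann, attr="bitscore", value=60, greater=False)
at_least_61 = sketch_base_num(ann, attr="bitscore", value=61)
```

A value equal to the threshold passes in both directions, so the two settings are not complementary.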

mgkit.filter.gff.filter_len(annotation, value=None, greater=True)[source]

Checks if an annotation length is greater than or equal to, or lower than or equal to, the requested value

Parameters
  • annotation – mgkit.io.gff.Annotation instance

  • value (int) – the value to which the annotation length should be compared

  • greater (bool) – if True the annotation length must be equal to or greater than the value; if False, equal to or lower than it

Returns

True if the test passes

Return type

bool
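A length filter of this kind can be sketched with any object that supports len(). In the example below, FakeAnnotation and sketch_len are hypothetical, and the length is assumed to be end - start + 1 for 1-based inclusive coordinates.

```python
class FakeAnnotation:
    # Hypothetical stand-in: length derived from inclusive coordinates.
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __len__(self):
        return self.end - self.start + 1


def sketch_len(annotation, value=None, greater=True):
    # greater=True -> length >= value; greater=False -> length <= value
    if greater:
        return len(annotation) >= value
    return len(annotation) <= value


annotations = [FakeAnnotation(1, 90), FakeAnnotation(1, 100), FakeAnnotation(1, 300)]

# Keep annotations that are at least 100 bases long.
long_enough = [len(ann) for ann in annotations if sketch_len(ann, value=100)]
# Keep annotations that are at most 100 bases long.
short_enough = [len(ann) for ann in annotations if sketch_len(ann, value=100, greater=False)]
```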