mgkit.mappings.enzyme module

New in version 0.1.14.

EC mappings

mgkit.mappings.enzyme.ENZCLASS_REGEX = '^(\\d)\\. ?([\\d-]+)\\. ?([\\d-]+)\\. ?([\\d-]+) +(.+)\\.'

Used to get the descriptions of the higher-level enzyme classes from the file enzclass.txt on ExPASy.
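As an illustration of what the pattern captures, the sketch below applies it to two lines written in the layout of enzclass.txt. The pattern is copied locally so the snippet runs without mgkit installed, and the sample lines are illustrative rather than read from the actual file.

```python
import re

# Pattern copied from ENZCLASS_REGEX above, so the snippet is self-contained
ENZCLASS_REGEX = r'^(\d)\. ?([\d-]+)\. ?([\d-]+)\. ?([\d-]+) +(.+)\.'

# Illustrative lines in the layout used by enzclass.txt
sample_lines = [
    '1. -. -.-  Oxidoreductases.',
    '1. 1. 1.-  With NAD(+) or NADP(+) as acceptor.',
]

parsed = []
for line in sample_lines:
    match = re.search(ENZCLASS_REGEX, line)
    if match is None:
        # Lines that do not describe an enzyme class are ignored
        continue
    # Groups 1-4 are the four EC levels ('-' when unspecified),
    # group 5 is the description without the trailing period
    parsed.append(match.groups())

for groups in parsed:
    print(groups)
```

Each match gives a 5-tuple; a '-' in one of the first four positions marks a level that is not specified for that class.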

mgkit.mappings.enzyme.LEVEL1_NAMES = {1: 'oxidoreductases', 2: 'transferases', 3: 'hydrolases', 4: 'lyases', 5: 'isomerases', 6: 'ligases'}

Top level classification names

mgkit.mappings.enzyme.change_mapping_level(ec_map, level=3)[source]

New in version 0.1.14.

Given a dictionary whose values are dictionaries, each with a key named ec whose value is an iterable of EC numbers, returns an iterator that can be used to build a dictionary with the same top-level keys, whose values are sets of the transformed EC numbers.

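Parameters
  • ec_map (dict) – dictionary mapping gene IDs to dictionaries that contain an ec key

  • level (int) – number from 1 to 4, to specify the level of the mapping

Yields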

tuple – a tuple (gene_id, set(ECs)), which can be passed to dict to make a dictionary

Example

>>> from mgkit.net.uniprot import get_gene_info
>>> from mgkit.mappings.enzyme import change_mapping_level
>>> ec_map = get_gene_info('Q9HFQ1', columns='ec')
>>> ec_map
{'Q9HFQ1': {'ec': '1.1.3.4'}}
>>> dict(change_mapping_level(ec_map, level=2))
{'Q9HFQ1': {'1.1'}}

mgkit.mappings.enzyme.get_enzyme_full_name(ec_id, ec_names, sep=', ')[source]

New in version 0.2.1.

From an EC identifier and a dictionary of names, builds a name (comma-separated by default) that identifies the function of the enzyme.

Parameters
  • ec_id (str) – EC identifier

  • ec_names (dict) – a dictionary of names that can be produced using parse_expasy_file()

  • sep (str) – string used to join the names

Returns

the enzyme classification name

Return type

str
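The idea of joining the names of each classification level can be sketched as follows. This is a hypothetical stand-in, not mgkit's implementation: the function full_name_sketch and the key format of ec_names (EC prefixes such as '1' or '1.1') are assumptions made for illustration, and the dictionary is written by hand instead of being produced by parse_expasy_file().

```python
def full_name_sketch(ec_id, ec_names, sep=', '):
    """Hypothetical stand-in for get_enzyme_full_name(), for illustration only."""
    parts = ec_id.split('.')
    names = []
    for level in range(1, len(parts) + 1):
        prefix = '.'.join(parts[:level])
        # Levels without a description in the dictionary are skipped
        if prefix in ec_names:
            names.append(ec_names[prefix])
    return sep.join(names)


# Hand-written dictionary; keying by EC prefix is an assumption
ec_names = {
    '1': 'Oxidoreductases',
    '1.1': 'Acting on the CH-OH group of donors',
    '1.1.3': 'With oxygen as acceptor',
}

full_name = full_name_sketch('1.1.3.4', ec_names)
print(full_name)
```

Passing a different sep only changes the string placed between the level names.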

mgkit.mappings.enzyme.get_enzyme_level(ec, level=4)[source]

New in version 0.1.14.

Returns an enzyme class at a specific level, between 1 and 4 (by default the most specific, 4).

Parameters
  • ec (str) – a string representing an EC number (e.g. 1.2.4.10)

  • level (int) – from 1 to 4, the level of specificity in the enzyme classification

Returns

the EC number at the requested specificity

Return type

str

Example

>>> from mgkit.mappings.enzyme import get_enzyme_level
>>> get_enzyme_level('1.1.3.4', 1)
'1'
>>> get_enzyme_level('1.1.3.4', 2)
'1.1'
>>> get_enzyme_level('1.1.3.4', 3)
'1.1.3'
>>> get_enzyme_level('1.1.3.4', 4)
'1.1.3.4'

mgkit.mappings.enzyme.get_mapping_level(ec_map, level=3)[source]

New in version 0.3.0.

Given a dictionary whose values are iterables of EC numbers, returns an iterator that can be used to build a dictionary with the same top-level keys, whose values are sets of the transformed EC numbers.

Parameters
  • ec_map (dict) – dictionary mapping genes to EC numbers

  • level (int) – number from 1 to 4, to specify the level of the mapping, passed to get_enzyme_level()

Yields

tuple – a tuple (gene_id, set(ECs)), which can be passed to dict to make a dictionary
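A minimal sketch of the behaviour described above, written so that it runs without mgkit. The helper names enzyme_level_sketch and mapping_level_sketch and the gene identifiers are hypothetical. The truncation rule is inferred from the get_enzyme_level() examples and may not match the library's actual code.

```python
def enzyme_level_sketch(ec, level=4):
    """Keeps the first `level` components of an EC number."""
    return '.'.join(ec.split('.')[:level])


def mapping_level_sketch(ec_map, level=3):
    """Yields (gene_id, set of truncated EC numbers) pairs."""
    for gene_id, ecs in ec_map.items():
        # A set collapses EC numbers that become identical after truncation
        yield gene_id, {enzyme_level_sketch(ec, level) for ec in ecs}


# Hypothetical gene identifiers mapped to lists of EC numbers
ec_map = {
    'gene1': ['1.1.3.4', '1.1.3.9'],
    'gene2': ['2.7.7.6'],
}

level3 = dict(mapping_level_sketch(ec_map, level=3))
level1 = dict(mapping_level_sketch(ec_map, level=1))
print(level3)
print(level1)
```

Because the generator yields pairs, the result can be passed directly to dict, as the Yields description suggests.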

mgkit.mappings.enzyme.parse_expasy_dat(expasy_dat, keep_empty=False, skip_comments=True, skip_codes=None)[source]

New in version 0.4.2.

Parses the information in the enzyme.dat file from ExPASy, a flat file containing the information about the enzyme classification.

It can be downloaded at: ftp://ftp.expasy.org/databases/enzyme/enzyme.dat

Parameters
  • expasy_dat (str) – file name or handle to an enzyme.dat file

  • keep_empty (bool) – sections that are empty are removed by default; set to True to keep them

  • skip_comments (bool) – used to avoid returning comments (lines starting with CC in the file)

  • skip_codes (set, tuple) – set, tuple or list of codes used to skip specific parts of the file, in the same way as skip_comments

Yields

dict – dictionary with each entry in the file, where the keys are the codes and the values are the lines included in the file
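To make the shape of the yielded dictionaries more concrete, here is a simplified sketch of a parser for this kind of flat file. It is not mgkit's implementation. It assumes that each line starts with a two-letter code followed by spaces and that an entry ends with a line containing //. The function parse_dat_sketch and the abridged sample entries are illustrative only.

```python
import io


def parse_dat_sketch(handle, skip_codes=('CC',)):
    """Yields one dict per entry, mapping line codes to lists of lines."""
    entry = {}
    for line in handle:
        line = line.rstrip('\n')
        if line.startswith('//'):
            # '//' closes the current entry; empty entries are dropped
            if entry:
                yield entry
            entry = {}
            continue
        code, _, value = line.partition(' ')
        if not code or code in skip_codes:
            # Blank lines and skipped codes (comments by default) are ignored
            continue
        entry.setdefault(code, []).append(value.strip())
    if entry:
        # Last entry, in case the file does not end with '//'
        yield entry


# Abridged, illustrative entries in the assumed layout
sample = io.StringIO(
    "ID   1.1.1.1\n"
    "DE   Alcohol dehydrogenase.\n"
    "CC   -!- Illustrative comment line.\n"
    "//\n"
    "ID   1.1.1.2\n"
    "DE   Alcohol dehydrogenase (NADP(+)).\n"
    "//\n"
)

entries = list(parse_dat_sketch(sample))
for entry in entries:
    print(entry)
```

In this sketch every value is a list, so codes that span several lines in the file keep all of their lines in order.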

mgkit.mappings.enzyme.parse_expasy_dat_section(expasy_dat_section, skip_comments=True, skip_codes=None)[source]

New in version 0.4.2.

Parses a single entry of the enzyme.dat file from ExPASy. Used internally by mgkit.mappings.enzyme.parse_expasy_dat(), which passes the other arguments through.

Returns

dictionary with the entry, with keys being the codes of the entry and the values the lines

Return type

dict

mgkit.mappings.enzyme.parse_expasy_file(file_name)[source]

Changed in version 0.4.2: changed to work on Python 3.x

Used to load enzyme descriptions from the file enzclass.txt on ExPASy.

The FTP url for enzclass.txt is: ftp://ftp.expasy.org/databases/enzyme/enzclass.txt