mgkit.snps.conv_func module

Wrappers to use some of the general functions of the snps package in a simpler way.

mgkit.snps.conv_func.get_full_dataframe(snp_data, taxonomy, min_num=3, index_type=None, filters=None)[source]

New in version 0.1.12.

Changed in version 0.2.2: added filters argument

Returns a DataFrame with the pN/pS of the given SNPs data.

Shortcut for using combine_sample_snps(), using filters from get_default_filters().

Parameters
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data

  • taxonomy – Uniprot Taxonomy

  • min_num (int) – minimum number of samples in which a valid pN/pS is found

  • index_type (str, None) – type of index to return

  • filters (iterable) – list of filters to apply, otherwise uses the default filters

Returns

pandas.DataFrame of pN/pS values. The index type is None (gene-taxon)

Return type

DataFrame
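As an illustration of what the min_num threshold means, the sketch below keeps only the rows that have a valid (non-NaN) value in at least min_num samples. It is a minimal pure-Python sketch, not the implementation used by combine_sample_snps(): the helper name filter_min_num, the row layout and the sample values are all hypothetical.

```python
import math


def filter_min_num(rows, min_num=3):
    """Keep rows with a valid (non-NaN) value in at least min_num samples.

    Hypothetical helper for illustration only; `rows` maps an index key
    to a dictionary sample -> value.
    """
    kept = {}
    for key, sample_values in rows.items():
        # Count only the samples where a value could be computed
        valid = [v for v in sample_values.values() if not math.isnan(v)]
        if len(valid) >= min_num:
            kept[key] = sample_values
    return kept


nan = float("nan")

# Toy data: (gene, taxon) -> {sample: value}; all values are made up
rows = {
    ("geneA", 10): {"S1": 0.5, "S2": 0.7, "S3": 1.2},
    ("geneB", 10): {"S1": 0.4, "S2": nan, "S3": nan},
}

filtered = filter_min_num(rows, min_num=3)
```

With min_num=3 only the first row survives, because the second one has a valid value in a single sample.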

mgkit.snps.conv_func.get_gene_map_dataframe(snp_data, taxonomy, gene_map, min_num=3, index_type='gene', filters=None)[source]

New in version 0.1.11.

Changed in version 0.2.2: added filters argument

Returns a DataFrame with the pN/pS of the given SNPs data, mapping all taxa to the gene map.

Shortcut for using combine_sample_snps(), with filters from get_default_filters() and map_gene_id() as the gene_func parameter.

Parameters
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data

  • taxonomy – Uniprot Taxonomy

  • min_num (int) – minimum number of samples in which a valid pN/pS is found

  • gene_map (dict) – dictionary of mappings for the gene_ids in the SNPs data

  • index_type (str, None) – type of index to return

  • filters (iterable) – list of filters to apply, otherwise uses the default filters

Returns

pandas.DataFrame of pN/pS values. The index type is ‘gene’

Return type

DataFrame
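The idea of mapping gene IDs through gene_map can be sketched as follows. This assumes that gene_map associates each gene_id with an iterable of target IDs and that unmapped genes are skipped. Both points are assumptions made for illustration, and map_ids is a hypothetical stand-in, not the map_gene_id() function itself.

```python
def map_ids(gene_id, gene_map):
    """Yield every ID that gene_id maps to.

    Hypothetical helper: genes missing from gene_map yield nothing.
    """
    for mapped_id in gene_map.get(gene_id, ()):
        yield mapped_id


# Toy mapping with made-up identifiers
gene_map = {
    "geneA": ["cat1"],
    "geneB": ["cat1", "cat2"],
}

mapped = {
    gene_id: list(map_ids(gene_id, gene_map))
    for gene_id in ["geneA", "geneB", "geneC"]
}
```

In this sketch a gene can contribute to more than one mapped ID, and a gene absent from the map contributes to none.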

mgkit.snps.conv_func.get_gene_taxon_dataframe(snp_data, taxonomy, gene_map, min_num=3, rank='genus', index_type=None, filters=None, use_uid=False)[source]

New in version 0.1.12.

Changed in version 0.2.2: added filters argument

Changed in version 0.5.1: gene_map can be None, use_uid can be passed to the underlying function

Todo

edit docstring

Returns a DataFrame with the pN/pS of the given SNPs data, mapping all taxa to the gene map.

Shortcut for using combine_sample_snps(), with filters from get_default_filters() and map_gene_id() as the gene_func parameter.

Parameters
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data

  • taxonomy – Uniprot Taxonomy

  • min_num (int) – minimum number of samples in which a valid pN/pS is found

  • gene_map (dict) – dictionary of mappings for the gene_ids in the SNPs data

  • index_type (str, None) – type of index to return

  • filters (iterable) – list of filters to apply, otherwise uses the default filters

  • use_uid (bool) – instead of using gene_id, uses uid as gene ID

Returns

pandas.DataFrame of pN/pS values. The index type is ‘gene’

Return type

DataFrame
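The index_type values that appear in this module (None for gene-taxon, 'gene' and 'taxon') can be read as a choice of the key under which values are grouped. The sketch below shows that reading with hypothetical helpers (make_key, group_values) and made-up records. It only collects values per key and says nothing about how the library actually combines them.

```python
from collections import defaultdict


def make_key(gene_id, taxon_id, index_type=None):
    """Return the grouping key for one record (hypothetical helper)."""
    if index_type is None:
        return (gene_id, taxon_id)
    if index_type == "gene":
        return gene_id
    if index_type == "taxon":
        return taxon_id
    raise ValueError("unknown index_type: {!r}".format(index_type))


def group_values(records, index_type=None):
    """Collect the values of (gene_id, taxon_id, value) records by key."""
    grouped = defaultdict(list)
    for gene_id, taxon_id, value in records:
        grouped[make_key(gene_id, taxon_id, index_type)].append(value)
    return dict(grouped)


# Toy records: (gene_id, taxon_id, value); all made up
records = [("geneA", 10, 0.5), ("geneA", 20, 0.7), ("geneB", 10, 1.1)]

by_gene = group_values(records, index_type="gene")
by_taxon = group_values(records, index_type="taxon")
by_both = group_values(records)
```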

mgkit.snps.conv_func.get_rank_dataframe(snp_data, taxonomy, min_num=3, rank='order', index_type='taxon', filters=None)[source]

New in version 0.1.11.

Changed in version 0.2.2: added filters argument

Returns a DataFrame with the pN/pS of the given SNPs data, mapping all taxa to the specified rank. Higher taxa won’t be included.

Shortcut for using combine_sample_snps(), with filters from get_default_filters() and map_taxon_id_to_rank() as the taxon_func parameter, with include_higher set to False.

Parameters
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data

  • taxonomy – Uniprot Taxonomy

  • min_num (int) – minimum number of samples in which a valid pN/pS is found

  • rank (str) – taxon rank to map. Valid ranks are found in mgkit.taxon.TAXON_RANKS

  • index_type (str, None) – type of index to return

  • filters (iterable) – list of filters to apply, otherwise uses the default filters

Returns

pandas.DataFrame of pN/pS values. The index type is ‘taxon’

Return type

DataFrame
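Mapping taxa to a rank while leaving out higher taxa can be pictured as a walk up the taxonomy. The sketch below uses a toy parent table instead of the Uniprot taxonomy object, and map_to_rank is a hypothetical stand-in for map_taxon_id_to_rank(), written under the assumption that a taxon placed above the requested rank is simply dropped.

```python
# Toy taxonomy: taxon_id -> (rank, parent_id); IDs are made up
TAXA = {
    1: ("superkingdom", None),
    2: ("phylum", 1),
    3: ("order", 2),
    4: ("genus", 3),
    5: ("species", 4),
}


def map_to_rank(taxon_id, rank, taxa=TAXA):
    """Return the ancestor of taxon_id (or taxon_id itself) at `rank`.

    Hypothetical helper: returns None when no taxon of that rank is found
    on the way to the root, which is the case for taxa above `rank`.
    """
    current = taxon_id
    while current is not None:
        current_rank, parent_id = taxa[current]
        if current_rank == rank:
            return current
        current = parent_id
    return None


species_to_order = map_to_rank(5, "order")  # species 5 sits below order 3
phylum_to_order = map_to_rank(2, "order")   # phylum 2 is above any order
```

Here the species is mapped to its order, while the phylum has no order among its ancestors and is left out.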