Holo Descriptor and Analyzer

class utils.ppseg.holo_descriptor.holo_descriptor.HoloDescriptor(ply_path)

Bases: object

The HoloDescriptor class calculates holo descriptors for a conformation from the predictions of the deep-learning model.

Parameters

ply_path (str) – Path to the .ply file. The file must carry the pred and predprobs attributes.

Example

from holo_descriptor import HoloDescriptor

holo_descriptor = HoloDescriptor(ply_path)
holo_descriptor.run()
holo_descriptor.save(json_path)

run()

Run to extract holo descriptors

Parameters

None

results

Dictionary containing the results of the descriptors.

Type

dict

save(json_path)

Save the results to a json file

Parameters

json_path (str) – Path to the json file.

utils.ppseg.holo_descriptor.holo_descriptor.save_descriptors(json_path, **kwargs)

Save the descriptors to a json file

Parameters
  • json_path (str) – Path to the json file.

  • kwargs – Dictionary containing the descriptors.

utils.ppseg.holo_descriptor.holo_descriptor.convert_numpy_types(obj)

Convert numpy types to native python types

Parameters

obj – Object to be converted

Returns

Converted object

Return type

obj
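Such a conversion is typically needed because the json module cannot serialize NumPy scalars or arrays. The following is a minimal sketch of the idea, not the library's implementation: the helper name to_native is hypothetical, and it relies only on the fact that NumPy arrays and NumPy scalars both expose a tolist() method.

```python
def to_native(obj):
    """Recursively replace NumPy-style values with built-in Python types.

    Hypothetical helper for illustration; not part of holo_descriptor.
    """
    if isinstance(obj, dict):
        # JSON object keys must be strings, so keys are converted too.
        return {str(key): to_native(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_native(item) for item in obj]
    if hasattr(obj, "tolist"):
        # NumPy arrays and NumPy scalars both provide tolist(), which
        # returns nested lists or a plain Python scalar respectively.
        return obj.tolist()
    return obj
```

The converted object can then be passed to json.dump without a custom encoder.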

utils.ppseg.holo_descriptor.holo_descriptor.read_descriptors(json_path)

Read the descriptors from a json file

Parameters

json_path (str) – Path to the json file.

Returns

Dictionary containing the descriptors.

Return type

dict
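As a rough illustration of how saving and reading fit together, the sketch below round-trips a descriptor dictionary through a json file using only the standard library. The function names ending in _sketch and the warnings key are hypothetical stand-ins; the actual save_descriptors and read_descriptors may handle formatting and type conversion differently.

```python
import json
import os
import tempfile


def save_descriptors_sketch(json_path, **kwargs):
    """Write the keyword arguments to json_path as one JSON object."""
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(kwargs, handle, indent=2)


def read_descriptors_sketch(json_path):
    """Load the JSON object stored at json_path into a dict."""
    with open(json_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def round_trip(**descriptors):
    """Save descriptors to a temporary file and read them back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        json_path = os.path.join(tmp_dir, "descriptors.json")
        save_descriptors_sketch(json_path, **descriptors)
        return read_descriptors_sketch(json_path)
```

Calling round_trip(holospace_volume=12.5) should return a dictionary equal to the one that was saved, provided every value is already a native Python type.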

class utils.ppseg.holo_descriptor.holo_descriptor.HoloDescriptorAnalyser(source_path, frag_info_path: Optional[str] = None)

Bases: object

The HoloDescriptorAnalyser class analyzes the holo descriptors of conformations, which are derived from the predictions of the deep-learning model.

Parameters
  • source_path (str) – Path to the folder containing the json files.

  • frag_info_path (str, optional) – Path to the fragment information json file (default: None).

source_path

Path to the folder containing the holo-descriptor json files.

Type

str

frag_info_path

Path to the fragment information json file.

Type

str

files

List of json files in the source path (after list_files).

Type

list

descriptors_df

A curated DataFrame containing the descriptors from the json files (after read()). A {colname}_zscore column is added after calculate_zscore(), the overall_score and rank columns are added after set_rank(), and the holospace_frag_score column is added after holospace_frag_score().

Type

pd.DataFrame

holospace_frag_volumes

DataFrame containing the holospace fragment volumes (after extract_holospace_frag_volume()). If frag_info_path is None, the default fragment information file is used.

Type

pd.DataFrame

A normal workflow for analyzing holo descriptors

  1. Create an instance of the HoloDescriptorAnalyser class

  2. List the files in the source path (list_files())

  3. Read the descriptors from the json files (read())

  4. Calculate the zscore of the column (calculate_zscore())

  5. Set the rank of the conformations (set_rank())

  6. Get the top n conformations (top_n())

  7. (optional) Extract the holospace fragment volumes (extract_holospace_frag_volume())

Example

from holo_descriptor import HoloDescriptorAnalyser

# Create an instance of the HoloDescriptorAnalyser class
holo_descriptor_analyser = HoloDescriptorAnalyser(
                                source_path, frag_info_path
                            )

# List the files in the source path
holo_descriptor_analyser.list_files()

# Read the descriptors from the json files
holo_descriptor_analyser.read()

# Calculate the zscore of the column
holo_descriptor_analyser.calculate_zscore("holospace_volume")

# Set the rank of the conformations
holo_descriptor_analyser.set_rank()

# Get the top 5 conformations
holo_descriptor_analyser.top_n(5)

# (optional) Extract the holospace fragment volumes
holo_descriptor_analyser.extract_holospace_frag_volume()

list_files()

List the files in the source path (*.json)
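For orientation, collecting the *.json files of a folder can be sketched with pathlib as below. This is an assumption about the general approach, not the method's actual code: list_json_files is a hypothetical name, and whether the real method sorts the files or searches subfolders is not stated here.

```python
from pathlib import Path


def list_json_files(source_path):
    """Return the *.json files directly inside source_path, sorted by name.

    Hypothetical helper; subfolders are not searched.
    """
    return sorted(str(path) for path in Path(source_path).glob("*.json"))
```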

read(holospace_calc=False)

Read the descriptors from the json files

Parameters

holospace_calc (bool) – Whether to also calculate the holospace fragment score while reading, which may take longer (default: False)

Returns

DataFrame containing the descriptors

Return type

pd.DataFrame

extract_holospace_frag_volume(num_frag=6)

Extract the holospace fragment volumes (if needed)

Parameters

num_frag (int) – Number of fragments (default: 6)

Returns

DataFrame containing the holospace fragment volumes

Return type

pd.DataFrame

calculate_zscore(colname, use_presets: Optional[str] = None)

Calculate the zscore of a column; the column name must be specified.

Parameters
  • colname (str) – Column name to be calculated

  • use_presets (str) – Preset mean and std to use: “pr” for post-rigor state myosin or “pps” for pre-powerstroke state myosin (default: None). If None, the mean and std are calculated from the data.
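A zscore expresses each value as its distance from the mean in units of the standard deviation, z = (x - mean) / std. The sketch below illustrates the two modes described above under stated assumptions: the function name zscores is hypothetical, preset statistics are passed in explicitly rather than looked up by name (the actual “pr” and “pps” values are not reproduced here), and the population standard deviation is used when the statistics come from the data, which may differ from the library's choice.

```python
import statistics


def zscores(values, mean=None, std=None):
    """Return the z-score of every value.

    Hypothetical helper. If mean or std is None, it is estimated from
    the data; otherwise the supplied (preset) statistic is used.
    """
    if mean is None:
        mean = statistics.fmean(values)
    if std is None:
        # Assumption: population standard deviation (ddof = 0).
        std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(value - mean) / std for value in values]
```

With preset statistics, the resulting zscores are comparable across separate runs, whereas data-derived statistics only rank conformations within one data set.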

set_rank(weights: Optional[ndarray] = None, zscore_columns: Optional[list] = None, filter_warning=True)

Set the rank of the conformations (based on the zscore columns)

Parameters
  • weights (np.ndarray) – Weights for the zscore columns (default: None, which gives equal weights to all zscore columns)

  • zscore_columns (list) – Zscore columns to be used, in the same order as weights (default: None, which uses all zscore columns in the data frame)

  • filter_warning (bool) – Sort by warnings first (default: True)
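The ranking can be pictured as a weighted combination of the zscore columns followed by a sort. The sketch below is only one plausible reading of the parameters, with several assumptions: rank_rows and the warning and other_zscore keys are hypothetical, rows are plain dictionaries instead of a DataFrame, a higher overall_score is treated as better, and "sort by warnings first" is interpreted as placing rows without warnings ahead of rows with warnings.

```python
def rank_rows(rows, zscore_columns=None, weights=None, filter_warning=True):
    """Return copies of rows with overall_score and rank added, best first.

    Hypothetical helper illustrating a weighted z-score ranking.
    """
    if zscore_columns is None:
        # Default: every column whose name ends with "_zscore".
        zscore_columns = sorted(k for k in rows[0] if k.endswith("_zscore"))
    if weights is None:
        # Default: equal weights that sum to one.
        weights = [1.0 / len(zscore_columns)] * len(zscore_columns)
    if len(weights) != len(zscore_columns):
        raise ValueError("weights must align with zscore_columns")

    scored = []
    for row in rows:
        score = sum(w * row[c] for w, c in zip(weights, zscore_columns))
        scored.append({**row, "overall_score": score})

    def sort_key(row):
        # Assumption: rows flagged with a warning are pushed to the end.
        flagged = bool(row.get("warning")) if filter_warning else False
        return (flagged, -row["overall_score"])

    scored.sort(key=sort_key)
    for position, row in enumerate(scored, start=1):
        row["rank"] = position
    return scored
```

Taking the first n entries of the returned list then corresponds to the idea behind top_n().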

top_n(n=5)

Get the top n conformations.

Parameters

n (int) – Number of top conformations (default: 5)

Returns

DataFrame containing the top n conformations

Return type

pd.DataFrame

load_frag_info(frag_info_path)

Load the fragment information

Parameters

frag_info_path (str) – Path to the fragment information json file.

Returns

Dictionary containing the fragment information

Note

  • self.fragment_info: dict

  • self.fragment_vol: np.ndarray (fragment volumes)

Return type

dict

holospace_frag_score()

Calculate the holospace fragment score

Returns

DataFrame containing the holospace fragment score (holospace_frag_score)

Return type

pd.DataFrame