Parallel Processor
- class utils.parallel.framework.Parallelization(max_workers=8)
Bases: object
Parallelization framework for running functions in parallel. You will need to prepare the inputs and the function to run.
- Parameters
max_workers (int) – maximum number of parallel workers (default: 8)
It is a simple framework that uses multiprocessing.Pool to run functions in parallel. Prepare the inputs as a list of lists, where each inner list contains the arguments for a single function call.
Example
    # Import the Parallelization class.
    from utils.parallel.framework import Parallelization

    # Create a Parallelization object and prepare the inputs.
    p_job = Parallelization(max_workers=4)
    p_job.prepare(test=True)

    # Print out the inputs in parallel.
    p_job.run(func=p_job.test_func)
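The "list of lists" input convention described above can be illustrated with a minimal sketch. It is not the framework's implementation: `run_parallel` and `scaled_sum` are hypothetical names, and the sketch uses the thread-backed `multiprocessing.pool.ThreadPool`, which exposes the same `starmap` interface as `multiprocessing.Pool`, so that it runs anywhere without pickling concerns.

```python
from multiprocessing.pool import ThreadPool


def run_parallel(func, inputs, max_workers=8):
    """Call func(*args) for every args list in inputs; results keep input order."""
    # ThreadPool stands in for multiprocessing.Pool here (assumption for portability).
    with ThreadPool(processes=max_workers) as pool:
        return pool.starmap(func, inputs)


def scaled_sum(a, b, scale):
    # Hypothetical worker function taking three positional arguments.
    return (a + b) * scale


# Each inner list holds the arguments for one function call.
inputs = [[1, 2, 10], [3, 4, 10], [5, 6, 10]]
results = run_parallel(scaled_sum, inputs, max_workers=4)
print(results)
```

Swapping `ThreadPool` for `multiprocessing.Pool` gives process-based workers with the same call pattern, provided the worker function is defined at module level so it can be pickled.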
- class utils.parallel.framework.TrajHandlerPreprocess(max_workers=8, logger_path: Optional[str] = None, task_name: str = 'core-dataprep')
Bases: Parallelization
Parallelization framework for preprocessing trajectory data. You will need to prepare the inputs and the function to run.
- Parameters
max_workers (int) – maximum number of parallel workers (default: 8)
logger_path (str) – path to save the logger file (default: None). If None, no logger will be created.
task_name (str) – name of the task for the logger (default: "core-dataprep")
Example
    # Import the TrajHandlerPreprocess class and the workflow function.
    from utils.parallel.framework import TrajHandlerPreprocess
    from utils.datasets.general import preprocess_workflow

    # Create a TrajHandlerPreprocess object and prepare the inputs.
    # traj_handler is an MDAnalysis Universe object created beforehand.
    p_job = TrajHandlerPreprocess(max_workers=4, logger_path="preprocess.log")
    p_job.prepare(
        traj_handler=traj_handler,
        root_path="output/p_data",
        filename="traj_data",
        frames_list=[0, 1, 2, 3, 4],
    )

    # Run the preprocessing workflow in parallel.
    p_job.run(func=preprocess_workflow)
Note
The source code of the preprocess_workflow function can be found in utils.datasets.general.preprocess_workflow().
- prepare(traj_handler, config=None, **kwargs)
Prepare the inputs for the preprocess workflow.
- Parameters
traj_handler – MDAnalysis Universe object
config – configuration object. If None, root_path and filename must be passed as keyword arguments.
root_path (str, optional keyword argument) – root path for the output files. If None, the root path must be set in the config file (output_p_data_folderpath).
filename (str, optional keyword argument) – filename for the output files. If None, the filename must be set in the config file (p_filename).
frames_list (list, optional keyword argument) – list of frames to process (e.g. [1, 2, 3]). If None, all frames will be processed.
index_path (str, optional keyword argument) – path to the index file. If not provided, the one found in the config file (output_index_path) will be used; if the config does not supply one either, it will be generated as [root_path]/[filename]_index.txt.
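One reading of the fallback rules for root_path, filename, and index_path can be sketched as follows. This is an illustration only: `resolve_output_settings` is a hypothetical helper, and the config object is assumed to expose its settings as attributes, which the documentation above does not confirm.

```python
from types import SimpleNamespace


def resolve_output_settings(config=None, **kwargs):
    """Hypothetical helper: keyword arguments win, config attributes are the fallback."""

    def pick(kwarg_name, config_attr):
        value = kwargs.get(kwarg_name)
        if value is None and config is not None:
            # Assumption: config stores its settings as plain attributes.
            value = getattr(config, config_attr, None)
        return value

    root_path = pick("root_path", "output_p_data_folderpath")
    filename = pick("filename", "p_filename")
    if root_path is None or filename is None:
        raise ValueError("root_path and filename must come from kwargs or config")

    index_path = pick("index_path", "output_index_path")
    if index_path is None:
        # Last resort: derive the index path from root_path and filename.
        index_path = f"{root_path}/{filename}_index.txt"
    return {"root_path": root_path, "filename": filename, "index_path": index_path}


# Keyword arguments only, no config object.
from_kwargs = resolve_output_settings(root_path="output/p_data", filename="traj_data")

# Config object only (a SimpleNamespace stands in for the real config class).
config = SimpleNamespace(
    output_p_data_folderpath="cfg_out",
    p_filename="cfg_traj",
    output_index_path="cfg_out/index.txt",
)
from_config = resolve_output_settings(config=config)
```

With keyword arguments only, the index path is derived from root_path and filename; with a config object that carries output_index_path, that value is used unchanged.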
- class utils.parallel.framework.TrajHandlerPrediction(max_workers=8, logger_path: Optional[str] = None, task_name: str = 'core-predict')
Bases: TrajHandlerPreprocess
Parallelization framework for prediction workflow. You will need to prepare the inputs and the function to run.
- Parameters
max_workers (int) – maximum number of parallel workers (default: 8)
logger_path (str) – path to save the logger file (default: None). If None, no logger will be created.
task_name (str) – name of the task for the logger (default: "core-predict")
Example
    # Import the TrajHandlerPrediction class and the prediction function.
    from utils.parallel.framework import TrajHandlerPrediction
    from utils.datasets.general import add_prediction_to_ply

    # Create a TrajHandlerPrediction object and prepare the inputs.
    # traj_handler is an MDAnalysis Universe object created beforehand.
    p_job = TrajHandlerPrediction(max_workers=4, logger_path="predict.log")
    p_job.prepare(
        traj_handler=traj_handler,
        root_path="output/p_data",
        filename="traj_data",
        frames_list=[0, 1, 2, 3, 4],
    )

    # Set up the function.
    p_job.set_function(
        func=add_prediction_to_ply,
        model_path="model/best_model.pt",
    )

    # Run the prediction in parallel.
    p_job.run()
Note
The source code of the add_prediction_to_ply function can be found in utils.datasets.general.add_prediction_to_ply().
- prepare(traj_handler, config=None, **kwargs)
Prepare the inputs for the preprocess workflow.
- Parameters
traj_handler – MDAnalysis Universe object
config – configuration object. If None, root_path and filename must be passed as keyword arguments.
root_path (str, optional keyword argument) – root path for the output files. If None, the root path must be set in the config file (output_p_data_folderpath).
filename (str, optional keyword argument) – filename for the output files. If None, the filename must be set in the config file (p_filename).
frames_list (list, optional keyword argument) – list of frames to process (e.g. [1, 2, 3]). If None, all frames will be processed.
index_path (str, optional keyword argument) – path to the index file. If not provided, the one found in the config file (output_index_path) will be used; if the config does not supply one either, it will be generated as [root_path]/[filename]_index.txt.
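The TrajHandlerPrediction example passes a fixed keyword argument (model_path) through set_function before calling run(). How the framework binds that argument is not documented here; a common way to get the same effect is functools.partial, sketched below. The `set_function` shown is a hypothetical stand-in, not the class method, and `describe_frame` is a placeholder worker.

```python
from functools import partial


def set_function(func, **fixed_kwargs):
    """Hypothetical stand-in: bind keyword arguments shared by every call."""
    return partial(func, **fixed_kwargs)


def describe_frame(frame, model_path):
    # Placeholder for a per-frame worker such as a prediction step.
    return f"frame {frame} scored with {model_path}"


# Bind model_path once, then call the bound function with per-frame input only.
bound = set_function(describe_frame, model_path="model/best_model.pt")
outputs = [bound(frame) for frame in [0, 1, 2]]
```

Binding shared arguments once keeps the per-call input lists limited to the values that actually vary between frames.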
- class utils.parallel.framework.TrajHandlerVisualization(max_workers=8, logger_path: Optional[str] = None, task_name: str = 'core-vis')
Bases: TrajHandlerPreprocess
Parallelization framework for visualization workflow. You will need to prepare the inputs and the function to run.
- Parameters
max_workers (int) – maximum number of parallel workers (default: 8)
logger_path (str) – path to save the logger file (default: None). If None, no logger will be created.
task_name (str) – name of the task for the logger (default: "core-vis")
Example
    # Import the TrajHandlerVisualization class and the visualization function.
    from utils.parallel.framework import TrajHandlerVisualization
    from utils.pymol_scripts.vis_pdb_ply import generate_pse

    # Create a TrajHandlerVisualization object and prepare the inputs.
    # traj_handler is an MDAnalysis Universe object created beforehand.
    p_job = TrajHandlerVisualization(max_workers=4)
    p_job.prepare(
        traj_handler=traj_handler,
        root_path="output/p_data",
        filename="traj_data",
        frames_list=[0, 1, 2, 3, 4],
    )

    # Set up the function.
    p_job.set_function(func=generate_pse, pymol_path="path/to/pymol")

    # Run the visualization in parallel.
    p_job.run()
Note
The source code of the generate_pse function can be found in utils.pymol_scripts.vis_pdb_ply.generate_pse().
- prepare(traj_handler, config=None, **kwargs)
Prepare the inputs for the preprocess workflow.
- Parameters
traj_handler – MDAnalysis Universe object
config – configuration object. If None, root_path and filename must be passed as keyword arguments.
root_path (str, optional keyword argument) – root path for the output files. If None, the root path must be set in the config file (output_p_data_folderpath).
filename (str, optional keyword argument) – filename for the output files. If None, the filename must be set in the config file (p_filename).
frames_list (list, optional keyword argument) – list of frames to process (e.g. [1, 2, 3]). If None, all frames will be processed.
index_path (str, optional keyword argument) – path to the index file. If not provided, the one found in the config file (output_index_path) will be used; if the config does not supply one either, it will be generated as [root_path]/[filename]_index.txt.
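The frames_list rule ("If None, all frames will be processed") can be pictured with a small sketch. `select_frames` is a hypothetical helper, and both the zero-based indexing and the range check are assumptions rather than documented behaviour of prepare.

```python
def select_frames(n_frames, frames_list=None):
    """Hypothetical helper: return the frame indices to process."""
    if frames_list is None:
        # No explicit selection: process every frame.
        return list(range(n_frames))
    # Assumption: frames are zero-based and must exist in the trajectory.
    out_of_range = [f for f in frames_list if not 0 <= f < n_frames]
    if out_of_range:
        raise IndexError(f"frames outside 0..{n_frames - 1}: {out_of_range}")
    return list(frames_list)


# With an MDAnalysis Universe, the frame count is len(universe.trajectory).
all_frames = select_frames(5)
some_frames = select_frames(5, frames_list=[1, 2, 3])
```

Validating the selection up front means a bad frame index fails before any worker starts, instead of surfacing partway through a parallel run.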