
Class MolecularWeightPools



Given a collection of molecules, calculate the molecular weight 
(in atomic mass units) of each.  Then create pools of <poolSize> 
molecules such that the molecular weights of the molecules in each pool 
are spaced as far apart as possible.  Output the molecules again, including 
their molecular weight and the index of the pool they were assigned to.

For example, if there were molecules named (A,B,C,D,E) with weights
(1,2,3,4,5) and a poolSize = 2, then this should output

A   1   0
D   4   0
B   2   1
E   5   1
C   3   2
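
The documentation does not state the exact assignment rule, so the following is only a minimal sketch of one rule that reproduces the example above: sort by weight, then deal molecules round-robin into ceil(n / poolSize) pools. The function name `assign_pools`, the plain (name, weight) tuples, and the use of None as the pool index when no pooling is requested are assumptions made for illustration; the real class works on OpenEye molecule streams. For counts that do not divide evenly, this rule can leave more than one pool short, which may differ from the actual implementation.

```python
import math


def assign_pools(weighted, pool_size):
    """Return (name, weight, pool_index) tuples, grouped by pool index.

    weighted  -- iterable of (name, weight) pairs (hypothetical input shape)
    pool_size -- desired pool size; 0 or negative means "no pooling"
    """
    # Sort by molecular weight so that neighbours in the list are
    # neighbours in weight.
    ordered = sorted(weighted, key=lambda item: item[1])

    if pool_size <= 0:
        # No pools requested: return the weight-ordered list only.
        # Using None for the pool index is an assumption of this sketch.
        return [(name, weight, None) for name, weight in ordered]

    n_pools = math.ceil(len(ordered) / pool_size)

    # Round-robin assignment: adjacent weights always land in different
    # pools, so members of one pool are n_pools ranks apart.
    assigned = [
        (name, weight, rank % n_pools)
        for rank, (name, weight) in enumerate(ordered)
    ]

    # sorted() is stable, so weight order is preserved inside each pool.
    return sorted(assigned, key=lambda row: row[2])


example = [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)]
for name, weight, pool in assign_pools(example, 2):
    print(name, weight, pool, sep="\t")
```

With the example input, the loop prints the five rows in the same order as the table above.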

Input: 
- sourceFile:  Molecule file
    Can be any format that oemolistream understands, provided the file has a
    properly named extension, for example "molecules.smi" for SMILES format.
    To read from stdin, specify the filename "-" or ".smi" or 
    something similar.  See the oemolistream documentation for more information.

- poolSize:  Integer
    Size of the pools to generate.  All pools should be of this size, except for
    the last one, which may be smaller.  If set to 0 or a negative number,
    no pools are generated and the whole original list is output
    with the molecular weights, in order by weight.

Output:
- poolFile:  Molecule file
    The source molecules are output again, but each molecule's title / label
    has the molecule's molecular weight and the index of its assigned pool
    appended to it.  These fields are tab-delimited, so if
    the output is simple SMILES, the results should open easily
    in a spreadsheet program (e.g. Excel) for additional sorting by pool number, etc.
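
As a rough illustration of the tab-delimited labels described above, the sketch below appends the weight and pool index to a title and then reads the fields back the way a spreadsheet import would. The helper name `label_with_pool` and the plain `str()` number formatting are assumptions; the documentation does not specify how the weight is formatted.

```python
import csv
import io


def label_with_pool(title, weight, pool_index):
    """Append weight and pool index to a title as tab-delimited fields.

    Hypothetical helper; the number formatting is an assumption.
    """
    return "\t".join([title, str(weight), str(pool_index)])


rows_in = [("A", 1, 0), ("D", 4, 0), ("B", 2, 1), ("E", 5, 1), ("C", 3, 2)]
labels = [label_with_pool(name, weight, pool) for name, weight, pool in rows_in]

# Read the labels back as tab-separated columns, as a spreadsheet would.
reader = csv.reader(io.StringIO("\n".join(labels)), delimiter="\t")
rows_out = list(reader)

# Re-sort by the weight column instead of the pool column.
by_weight = sorted(rows_out, key=lambda row: float(row[1]))
for row in by_weight:
    print(row)
```

Because the fields are plain tab-separated text, any later sorting or filtering by pool number or weight needs no knowledge of the molecule format.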



Instance Methods
 
definePoolsByFilename(self, sourceFilename, poolSize, poolFilename)
Opens the files with the given names and delegates most of the work to "definePools".
 
definePools(self, sourceOEIS, poolSize, poolOEOS)
Primary method.  Reads the source file to generate the molecular-weight-distributed pools.

Method Details

definePools(self, sourceOEIS, poolSize, poolOEOS)

 

Primary method.  Reads the source file to generate the molecular-weight-distributed pools.  See the module documentation for more information.

Note: This method takes oemolistream and oemolostream objects, not filenames, so that the caller can pass "virtual files" for testing and interfacing.  Use the "main" method to have the module take care of opening files from filenames; "definePoolsByFilename" likewise opens the files by name and delegates to this method.