Abstract base class for all kernel classes. Kernel classes are those
that can take a pair of objects and calculate a similarity score
between them for use in a support vector machine (SVM) style
machine-learning application.

These objects need not be of any particular type as far as this
interface is concerned. They may be a pair of strings, molecules
(OEMolBase), vectors, etc. It is up to the implementing class to make
those distinctions.
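
As a minimal sketch of what an implementing class might look like, the following defines a stand-in abstract base and one concrete kernel for strings. The names `BaseKernel` and `SharedCharacterKernel` are hypothetical and are not taken from this module; only the `similarity(self, obj0, obj1)` signature comes from the method summary. The sketch assumes Python 3.

```python
from abc import ABC, abstractmethod


class BaseKernel(ABC):
    """Hypothetical stand-in for the abstract kernel base class."""

    @abstractmethod
    def similarity(self, obj0, obj1):
        """Return a non-negative similarity score for the two objects."""


class SharedCharacterKernel(BaseKernel):
    """Illustrative string kernel: the number of distinct shared characters."""

    def similarity(self, obj0, obj1):
        # The implementing class decides what the objects are; here, strings.
        return float(len(set(obj0) & set(obj1)))


kernel = SharedCharacterKernel()
score = kernel.similarity("kernel", "colonel")  # shares 'e', 'n', 'l'
```

A kernel over molecules or vectors would follow the same shape and change only how `similarity` interprets its two arguments.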

Ultimately, the scores generated from these kernels will probably be
used to build a "Gram matrix": the matrix of scores for every pair of
objects in a source list compared against itself. This abstract class
provides a convenience method for generating this matrix given an
iterator factory for the list, outputting it as a tab-delimited file.
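
The matrix generation described above can be sketched roughly as follows. This is not the class's actual implementation: `write_gram_matrix`, `make_iterator`, and `shared_characters` are hypothetical names, and the factory is modeled as a plain zero-argument callable that returns a fresh iterator on each call.

```python
import io


def write_gram_matrix(similarity, make_iterator, out_file):
    """Write similarity(a, b) for every ordered pair as tab-delimited rows.

    make_iterator is an assumed zero-argument callable that returns a
    fresh iterator over the source objects each time it is called.
    """
    for row_obj in make_iterator():
        # A new inner iterator is needed for every row of the matrix.
        scores = (similarity(row_obj, col_obj) for col_obj in make_iterator())
        out_file.write("\t".join(str(score) for score in scores))
        out_file.write("\n")


def shared_characters(a, b):
    """Toy similarity: number of distinct characters the strings share."""
    return len(set(a) & set(b))


words = ["abc", "bcd", "xyz"]
buffer = io.StringIO()
write_gram_matrix(shared_characters, lambda: iter(words), buffer)
gram_text = buffer.getvalue()
```

Each output line holds one row of the matrix, so a list of three objects yields three lines of three tab-separated scores.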

An object iterator factory, that is, an object that can produce fresh
iterators over the object list, must be used rather than a simple
iterator because nested loops iterate over the objects multiple
times. A plain file object, for example, would be a problem because it
reaches end-of-file after the first pass. The Common.IteratorFactory
module contains a couple of classes for generating such factories from
common source types (files, arrays, oemolistream).
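
To illustrate why a one-shot iterator is not enough, the sketch below runs the same nested loop over a plain iterator and over a factory-like object. `ListIteratorFactory` is a hypothetical class written for this example; it does not reproduce the actual classes in Common.IteratorFactory.

```python
class ListIteratorFactory:
    """Hypothetical factory: every iter() call yields a fresh iterator."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


def count_pairs(source):
    """Count the (outer, inner) pairs a nested loop over source produces."""
    return sum(1 for _outer in source for _inner in source)


items = ["a", "b", "c"]

# A bare iterator is shared by both loops, so the inner loop drains it
# and the outer loop stops after its first element.
pairs_from_iterator = count_pairs(iter(items))

# The factory hands each loop its own iterator, giving all 3 x 3 pairs.
pairs_from_factory = count_pairs(ListIteratorFactory(items))
```

With three items, the bare iterator yields only 2 pairs while the factory yields the full 9, which is the behavior a complete similarity matrix depends on.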
|
similarity(self, obj0, obj1)
Primary abstract method: given two objects, return an appropriate,
non-negative similarity score between the two. |
|
|
|
dictionaryDotProduct(self, featureDict1, featureDict2)
Given two dictionaries, treat them as vectors and take the
"dot product" between them. |
|
|
|
dictionaryEuclideanDistanceSquared(self, featureDict1, featureDict2)
Given two dictionaries, treat them as vectors and calculate the
squared Euclidean distance between them. |
|
|
|
|
|
getFeatureDictionary(self, obj, objIndex)
See if a feature dictionary has already been created for the
object at the specified objIndex. |
|
|
|
normalizeFeatureDictionary(self, featureDict)
Given a dictionary, interpret it as a feature vector whose values
are numeric. |
|
|
|
ensureListCapacity(self, aList, targetSize)
Ensure that the given list is at least the given size. |
|
|
|
|
|
outputMatrix(self, objIterFactory, outFile)
Utility method to calculate a similarity for every pair of objects
produced by the iterators of the objIterFactory and write the scores
to the outFile as a tab-delimited matrix of values. |
|
|
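
The dictionary-based helpers in the summary (dictionaryDotProduct, dictionaryEuclideanDistanceSquared, normalizeFeatureDictionary) treat a dictionary as a sparse feature vector. The standalone functions below are a rough sketch of that idea, not the class's own code. Two points are assumptions: keys missing from a dictionary count as zero, and "normalize" means scaling to unit Euclidean length, which the summary does not state.

```python
import math


def dictionary_dot_product(feature_dict1, feature_dict2):
    """Dot product of two sparse vectors; missing keys are assumed zero."""
    # Iterate over the smaller dictionary to do fewer lookups.
    if len(feature_dict1) > len(feature_dict2):
        feature_dict1, feature_dict2 = feature_dict2, feature_dict1
    return sum(value * feature_dict2.get(key, 0.0)
               for key, value in feature_dict1.items())


def dictionary_euclidean_distance_squared(feature_dict1, feature_dict2):
    """Squared Euclidean distance over the union of both key sets."""
    keys = set(feature_dict1) | set(feature_dict2)
    return sum((feature_dict1.get(key, 0.0) - feature_dict2.get(key, 0.0)) ** 2
               for key in keys)


def normalize_feature_dictionary(feature_dict):
    """Return a copy scaled to unit length (assumed meaning of normalize)."""
    norm = math.sqrt(dictionary_dot_product(feature_dict, feature_dict))
    if norm == 0.0:
        return dict(feature_dict)  # an all-zero vector cannot be scaled
    return {key: value / norm for key, value in feature_dict.items()}


a = {"C": 3.0, "N": 4.0}
b = {"C": 1.0, "O": 2.0}
dot = dictionary_dot_product(a, b)
dist_sq = dictionary_euclidean_distance_squared(a, b)
unit_a = normalize_feature_dictionary(a)
```

The two measures are related: the squared distance equals a·a + b·b - 2(a·b), so for the example vectors 25 + 5 - 6 = 24.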