Walkthrough: CS106 Real Molecular (Java)

Implement a molecular calculator as practice in using third-party Java libraries.

Requirement

This is the last practical exercise and will continue over the remaining weeks of the course. In this practical you will implement a real molecular similarity method, Ultrafast Shape Recognition, to search compound databases for similar molecular shapes.

The problem involves reading one reference molecule from a file and calculating a descriptor for it, then reading a series of molecules from a second file, computing the descriptor for each molecule, and quantifying the difference between it and the reference. At the end of the run the program should report the closest molecule and the magnitude of its difference from the reference. All files will be in SD format, and hydrogens should be completely ignored in the procedure.

The descriptor we will calculate consists of 4 triples of numbers (12 numbers in total). Each triple consists of 3 statistical measures of the distances from a point. The measures are:

- The mean distance from the point (the sum of all distances divided by the number of distances).
- The variance of this distance (the sum of the squares of (distance - mean), divided by the number of distances minus 1).
- The skew of this distance (the sum of the cubes of ((distance - mean) / standard deviation), divided by the number of distances).
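As a rough illustration of the three measures, here is a minimal Java sketch. The class name `UsrMoments` and its method names are hypothetical and are not the example methods supplied with the course. The guards for fewer than two values and for zero spread are assumptions added to avoid division by zero. The sketch takes the standard deviation to be the square root of the variance.

```java
/** Hypothetical helper class; a sketch of the three measures, not the course-supplied code. */
final class UsrMoments {
    private UsrMoments() {}

    /** Sum of all distances divided by the number of distances. */
    static double mean(double[] d) {
        double sum = 0.0;
        for (double x : d) {
            sum += x;
        }
        return sum / d.length;
    }

    /** Sum of squared deviations from the mean, divided by (n - 1). */
    static double variance(double[] d) {
        if (d.length < 2) {
            return 0.0; // assumption: a single distance has no spread
        }
        double m = mean(d);
        double sumSquares = 0.0;
        for (double x : d) {
            sumSquares += (x - m) * (x - m);
        }
        return sumSquares / (d.length - 1);
    }

    /** Sum of cubes of ((distance - mean) / standard deviation), divided by n. */
    static double skew(double[] d) {
        double m = mean(d);
        double sd = Math.sqrt(variance(d));
        if (sd == 0.0) {
            return 0.0; // assumption: identical distances are treated as unskewed
        }
        double sumCubes = 0.0;
        for (double x : d) {
            double z = (x - m) / sd;
            sumCubes += z * z * z;
        }
        return sumCubes / d.length;
    }
}
```

Each method takes the array of distances from one reference point to every non-hydrogen atom, so calling all three on the same array yields one triple of the descriptor.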
The standard deviation is the square root of the variance.

The four points we use to calculate these from are:

1. The centre of gravity (COG).
2. The closest atom position to the COG.
3. The furthest atom position from the COG.
4. The furthest atom position from point 3 above.

To calculate the difference between any set of 12 doubles and another, simply do the equivalent of a distance calculation, but over all 12 numbers.

Remember that we know how to read SD files from a previous practical; however, here is a reminder.

In order to access the CDK library you will need some import statements:

```java
import java.io.FileInputStream; // needed for the FileInputStream used below
import org.openscience.cdk.CDKConstants;
import org.openscience.cdk.Molecule;
import org.openscience.cdk.DefaultChemObjectBuilder;
import org.openscience.cdk.io.iterator.IteratingMDLReader;
import org.openscience.cdk.io.MDLWriter;
import org.openscience.cdk.interfaces.*;
```

To read a single molecule from an SD file you could use something like:

```java
IteratingMDLReader MDLReader = new IteratingMDLReader(new FileInputStream(RefFile), DefaultChemObjectBuilder.getInstance());
// Take only the first molecule in the file as the reference
if (MDLReader.hasNext()) mymol = (Molecule)MDLReader.next();
```

To read a sequence of molecules from an SD file:

```java
MDLReader = new IteratingMDLReader(new FileInputStream(ScrFile), DefaultChemObjectBuilder.getInstance());
while (MDLReader.hasNext()) {
    mymol = (Molecule)MDLReader.next();
    // process mymol here before the next one is read
}
MDLReader.close();
```

To get the name of a Molecule object (here called m1):

```java
Name = new String(String.valueOf(m1.getProperty(CDKConstants.TITLE)));
```

To get its number of atoms:

```java
int natoms = m1.getAtomCount();
```

You can get each atom in a molecule, where i is the index of the ith atom, by:

```java
IAtom myatom = m1.getAtom(i);
```

You can get the chemical symbol from each atom:

```java
String s1 = myatom.getSymbol();
```

You can get the coordinates as a Point3d object (to use the Point3d class you have to import javax.vecmath.Point3d) by:

```java
Point3d mypoint = myatom.getPoint3d();
```

The Point3d class has a method called distance, which returns the distance between the calling instance and its argument, so:

```java
Point3d a, b;
...
d = a.distance(b);
```

In addition to the usual criteria of functionality, readability, comments and a readme file, I request that you prepare a document called plan.txt in which you write a simple logic plan for the program. So that you don't get bogged down in the statistics, I have given you a set of example methods to calculate mean, variance and skew.

Reposted from: http://www.3daixie.com/contents/11/3444.html
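The four reference points and the 12-number comparison described in the handout can be sketched as follows. This is only an illustration under stated assumptions. Atom positions are plain `{x, y, z}` arrays rather than CDK `Point3d` objects, so that the sketch runs without the library. Hydrogens are assumed to have been filtered out already. The centre of gravity is taken as the unweighted average of the atom positions. The "distance calculation over all 12 numbers" is read as a Euclidean distance. The class `UsrSketch` and all of its method names are hypothetical.

```java
/** Hypothetical sketch of the 12-number descriptor; not the course's reference solution. */
final class UsrSketch {
    private UsrSketch() {}

    /** Euclidean distance between two points given as {x, y, z}. */
    static double dist(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /** Assumption: COG is the unweighted average of the atom positions. */
    static double[] centroid(double[][] atoms) {
        double[] c = new double[3];
        for (double[] p : atoms) {
            for (int k = 0; k < 3; k++) c[k] += p[k];
        }
        for (int k = 0; k < 3; k++) c[k] /= atoms.length;
        return c;
    }

    /** Returns the atom furthest from ref (furthest == true) or closest to it. */
    static double[] extremeAtom(double[][] atoms, double[] ref, boolean furthest) {
        double[] best = atoms[0];
        double bestD = dist(best, ref);
        for (double[] p : atoms) {
            double d = dist(p, ref);
            if (furthest ? d > bestD : d < bestD) {
                best = p;
                bestD = d;
            }
        }
        return best;
    }

    /** Mean, variance and skew of the distances from ref to every atom. */
    static double[] moments(double[][] atoms, double[] ref) {
        int n = atoms.length;
        double[] d = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            d[i] = dist(atoms[i], ref);
            sum += d[i];
        }
        double mean = sum / n;
        double squares = 0.0;
        for (double x : d) squares += (x - mean) * (x - mean);
        double var = n > 1 ? squares / (n - 1) : 0.0; // assumption: no spread for one atom
        double sd = Math.sqrt(var);
        double cubes = 0.0;
        if (sd > 0.0) {
            for (double x : d) {
                double z = (x - mean) / sd;
                cubes += z * z * z;
            }
        }
        return new double[] {mean, var, sd > 0.0 ? cubes / n : 0.0};
    }

    /** Four triples, in the order of the four reference points. */
    static double[] descriptor(double[][] atoms) {
        double[] cog = centroid(atoms);
        double[] closest = extremeAtom(atoms, cog, false);
        double[] furthest = extremeAtom(atoms, cog, true);
        double[] furthestFromThird = extremeAtom(atoms, furthest, true);
        double[][] refs = {cog, closest, furthest, furthestFromThird};
        double[] out = new double[12];
        for (int r = 0; r < 4; r++) {
            System.arraycopy(moments(atoms, refs[r]), 0, out, 3 * r, 3);
        }
        return out;
    }

    /** Assumption: the difference is the Euclidean distance over all 12 numbers. */
    static double difference(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }
}
```

In the actual program the coordinate arrays would be filled from `getPoint3d()` for every atom whose `getSymbol()` is not "H", and the search would keep the molecule with the smallest `difference` from the reference descriptor.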
