
COSMOtherm/COSMO-RS: Theory and Background

Computational chemistry was born and grew up with isolated molecules. This state, which is approximately realized in the gas phase, may be identified with the south pole of the world of computational chemistry (see Fig. 1). Here many powerful tools have been developed, and a great variety of molecular properties can be reliably calculated using state-of-the-art quantum chemical programs.

Figure 1: The world of quantum chemistry

But most of the relevant chemistry happens in solution. Solvents (shown as islands) differ from one another, and each has a specific influence on the energetics and properties of the solute.

Calculating a molecule in solution directly is very complicated, because a realistic representation requires a large number of solvent molecules and because thermodynamic averaging is needed, which can be achieved by Molecular Dynamics or Monte Carlo methods. These direct methods are therefore very time consuming and still suffer from the approximations, e.g. a force-field representation without polarization, that are required to make them feasible at all.

Group contribution methods like CLOGP or UNIFAC, which have been in common use for decades for calculating thermodynamic partition behaviour, are the other extreme. They are very fast, at the cost of a poor representation of the actual chemistry: molecules are divided into fragments, losing any information about intramolecular interactions.

In this situation dielectric continuum solvation models (CSMs) appear to be a reasonable way out. They allow a fast and self-consistent treatment of molecules in solution, even at the quantum chemical level, by approximating the most important feature of solvents, i.e. their electrostatic interaction with the solute, by classical dielectric theory.

But although CSMs give reasonable results for water and alkanes as solvents, macroscopic dielectric theory is not valid for solutes in polar solvents. Instead, the real value of CSMs like COSMO or PCM is to provide access to the state of ideal screening, i.e. the self-consistent state of a molecule embedded in a virtual conductor. Although never realized in nature, this state is an extremely good reference point for molecules in solution. We therefore identify it with the north pole of the globe of computational chemistry. Efficient quantum chemical COSMO calculations, which are no more expensive than gas-phase calculations, yield the total energy of the solute in the conductor as well as the screening charge density on the molecular surface.

Figure 2 shows the molecular interactions in liquids considered by COSMO-RS. First, the virtual conducting molecular contact surface (double layer) is computed by the quantum chemical COSMO method, so the screening charge density σ is known on both layers. When the layers come into close contact, contact interactions occur that depend only on σ and σ':

Figure 2: COSMO-RS view of molecular interactions in liquids

COSMO-RS is a theory that describes the interactions in a fluid as local contact interactions of molecular surfaces (see Fig. 2). The interaction energies are quantified by the values of the two screening charge densities σ and σ' that form a molecular contact. The most important contributions to the interaction energy functional are the electrostatic misfit energy and hydrogen bonding.
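The page does not spell out the interaction energy functional. As a minimal sketch, the two contributions can be written in the forms commonly quoted in the COSMO-RS literature: a misfit term quadratic in σ + σ', and a hydrogen-bond term that only acts when a sufficiently negative (donor) segment meets a sufficiently positive (acceptor) segment. The function names and all numerical parameter values (`a_eff`, `alpha`, `c_hb`, `sigma_hb`) are illustrative assumptions, not a published parameterization.

```python
def misfit_energy(sigma1, sigma2, a_eff=7.0, alpha=6000.0):
    """Electrostatic misfit energy of one segment contact.

    sigma1, sigma2: screening charge densities (e/A^2)
    a_eff: effective contact area (A^2), placeholder value
    alpha: misfit coefficient, placeholder value
    The energy vanishes when sigma1 == -sigma2 (perfect match).
    """
    return 0.5 * a_eff * alpha * (sigma1 + sigma2) ** 2


def hbond_energy(sigma1, sigma2, a_eff=7.0, c_hb=36000.0, sigma_hb=0.008):
    """Hydrogen-bond energy of one segment contact (zero or negative).

    Contributes only if the more negative segment lies below -sigma_hb
    and the more positive segment lies above +sigma_hb.
    c_hb and sigma_hb are placeholder values.
    """
    donor = min(sigma1, sigma2)
    acceptor = max(sigma1, sigma2)
    product = min(0.0, donor + sigma_hb) * max(0.0, acceptor - sigma_hb)
    return a_eff * c_hb * min(0.0, product)


def contact_energy(sigma1, sigma2):
    """Total contact energy as the sum of misfit and hydrogen bonding."""
    return misfit_energy(sigma1, sigma2) + hbond_energy(sigma1, sigma2)
```

With these forms, a contact between two weakly polar segments carries only a small misfit penalty, while a strongly polar donor-acceptor pair is stabilized by the hydrogen-bond term.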

Figure 3: Screening charge distribution and σ-profile of vanillin

Having reduced all interactions to local interactions of pairs of molecular surface pieces, we may now consider the ensemble of interacting molecules as an ensemble of independently interacting surface segments. In this picture each molecule is represented by a histogram of surface area with respect to screening charge density, called the σ-profile in the framework of COSMO-RS (Figure 3). Some representative σ-profiles are shown in Figure 4. Evidently, σ-profiles provide rich and detailed quantitative information about the polarity of molecules.
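Since a σ-profile is described here as a histogram of surface area over screening charge density, it can be sketched as an area-weighted histogram. The function name, the σ range, the bin count, and the toy segment data below are assumptions made for illustration; a real profile would be built from the surface segments written by a COSMO calculation.

```python
import numpy as np


def sigma_profile(segment_sigma, segment_area,
                  sigma_min=-0.03, sigma_max=0.03, n_bins=60):
    """Area-weighted histogram of screening charge densities.

    segment_sigma: screening charge density of each surface segment (e/A^2)
    segment_area:  area of each surface segment (A^2)
    Returns (bin_centers, area_per_bin).
    """
    edges = np.linspace(sigma_min, sigma_max, n_bins + 1)
    area_per_bin, _ = np.histogram(segment_sigma, bins=edges,
                                   weights=segment_area)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, area_per_bin


# Hypothetical toy segments, not taken from any real molecule.
sigma = np.array([-0.012, -0.004, 0.0005, 0.0005, 0.009, 0.015])
area = np.array([2.0, 5.0, 10.0, 8.0, 4.0, 1.0])
centers, profile = sigma_profile(sigma, area)
```

The sum over all bins recovers the total surface area of the molecule, as long as every segment falls inside the chosen σ range.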

Figure 4: σ-profiles of representative liquids

This reduction of interacting molecules to pairs of interacting surface pieces enormously simplifies the problem, since the statistical thermodynamics of the latter can be exactly solved in milliseconds on a PC (Figure 5). As a result we are able to calculate the chemical potential of an almost arbitrary solute X in an almost arbitrary liquid system (solvent) S, even of mixtures.

COSMO-RS statistical thermodynamics (Figure 5) follows from three considerations:
(1) Replace the ensemble of interacting molecules by an ensemble S of interacting pairs of surface segments.
(2) The ensemble S is fully characterized by its σ-profile p(σ) (note: the p(σ) of a mixture is additive).
(3) The chemical potential of a surface segment with charge density σ is described exactly by:

Figure 5: COSMO-RS statistical thermodynamics
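The equation itself is only given in Figure 5. In the COSMO-RS literature it is usually written as the self-consistent relation μ_S(σ) = −RT ln ∫ p_S(σ') exp{[μ_S(σ') − E(σ,σ')]/RT} dσ', with p_S normalized to unity and E the contact energy. The sketch below solves a discretized version by damped fixed-point iteration; the solver name, the damping scheme, the Gaussian toy profile and the misfit-only energy with placeholder coefficients are all assumptions for illustration.

```python
import numpy as np

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def segment_potentials(p, energy, T=298.15, tol=1e-10, max_iter=50000):
    """Solve mu_i = -RT ln sum_j p_j exp((mu_j - E_ij)/RT) on a sigma grid.

    p:      sigma-profile of the ensemble S on the grid (any normalization)
    energy: matrix E_ij of contact energies between grid points (kJ/mol)
    Returns the segment chemical potentials mu_i (kJ/mol).
    """
    p = np.asarray(p, dtype=float)
    p = p / p.sum()                      # normalize the profile to unity
    rt = R * T
    mu = np.zeros_like(p)
    for _ in range(max_iter):
        arg = (mu[None, :] - energy) / rt
        mu_new = -rt * np.log(np.exp(arg) @ p)
        mu_next = 0.5 * (mu + mu_new)    # damping avoids oscillation
        if np.max(np.abs(mu_next - mu)) < tol:
            return mu_next
        mu = mu_next
    raise RuntimeError("segment potentials did not converge")


# Toy ensemble: Gaussian sigma-profile and misfit-only contact energy.
sigma = np.linspace(-0.02, 0.02, 41)
p = np.exp(-0.5 * (sigma / 0.007) ** 2)
A_EFF, ALPHA = 7.0, 6000.0               # placeholder parameters
energy = 0.5 * A_EFF * ALPHA * (sigma[:, None] + sigma[None, :]) ** 2
mu = segment_potentials(p, energy)
```

For a mixture, the additivity in point (2) means that `p` can be formed as the mole-fraction-weighted sum of the component profiles before it is passed to the solver.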

Now we are able to calculate arbitrary liquid-liquid equilibrium properties like partition coefficients, activity coefficients, solubilities, heats of mixing, etc. In addition we can calculate vapor pressures and partial pressures as well, making use of the energy difference of isolated and perfectly screened solutes, and adding a heuristic, but plausible term, which describes the dispersive energy contributions.
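As a sketch of how such properties follow from chemical potentials, the standard thermodynamic relations can be coded directly. The function names are hypothetical, and the inputs are assumed to be pseudo-chemical potentials (without the ideal mixing term) in kcal/mol. The partition coefficient here is on a mole-fraction basis; converting to concentrations would need an additional solvent molar volume ratio, which is omitted.

```python
import math

R_KCAL = 1.98720425864e-3  # gas constant in kcal/(mol K)


def ln_activity_coefficient(mu_in_solvent, mu_in_pure_solute, T=298.15):
    """ln(gamma) of solute X in solvent S, referenced to pure liquid X."""
    return (mu_in_solvent - mu_in_pure_solute) / (R_KCAL * T)


def log10_partition(mu_in_phase1, mu_in_phase2, T=298.15):
    """log10 of the mole-fraction ratio x1/x2 of a solute between two phases.

    A lower chemical potential in phase 1 gives a positive value,
    i.e. the solute prefers phase 1.
    """
    return (mu_in_phase2 - mu_in_phase1) / (R_KCAL * T * math.log(10))


# Hypothetical numbers: the solute is about 1.36 kcal/mol more stable in phase 1.
example_log_p = log10_partition(-1.36425, 0.0)
```

At 25°C a chemical potential difference of about 1.36 kcal/mol corresponds to one log unit in the partition coefficient.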

A COSMO-RS parameterization depends only on the underlying quantum chemical method and basis set. COSMO-RS uses a very small number of adjustable parameters (8 inherent parameters and 1 additional parameter for each element). The parameterization data set consists of pure compound data at T = 25°C: 6000 experimental physico-chemical data points for 800 small molecules built of the elements H, C, N, O, F, Cl, Br, I and S. It comprises solvent-water partition coefficients logP(S-H2O), where S = octanol, benzene, hexane, CCl4, ether and ethyl acetate, as well as temperature-dependent vapor pressures, free energies of hydration, and activity coefficients at infinite dilution in water.

The parameterizations are evaluated on a validation test set of 2000 data points, consisting of temperature-dependent activity coefficients, Henry's law constants and vapor pressures for various solutes in different solvents S; the solvents used in the validation set are not part of the parameterization data set. The accuracy achieved on the parameterization and validation data sets can serve as an approximate measure of the accuracy to be expected from COSMO-RS: chemical potential differences are reproduced with an RMS accuracy of ~0.35 kcal/mol, which corresponds to roughly a factor of 2 in the equilibrium constant.
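The link between the quoted RMS error and the factor in the equilibrium constant is the Boltzmann relation K ∝ exp(−Δμ/RT). A short check, assuming T = 25°C (the helper name is hypothetical):

```python
import math

R_KCAL = 1.98720425864e-3  # gas constant in kcal/(mol K)


def equilibrium_constant_factor(delta_mu_kcal, T=298.15):
    """Multiplicative error in K caused by an error delta_mu in kcal/mol."""
    return math.exp(delta_mu_kcal / (R_KCAL * T))


factor = equilibrium_constant_factor(0.35)
# The same error expressed in log10 units of the equilibrium constant.
log_units = 0.35 / (R_KCAL * 298.15 * math.log(10))
```

The result is a factor of about 1.8, or roughly a quarter of a log unit, which is consistent with the rounded "factor of 2" given in the text.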

Figure 6: Phase diagrams of mixtures.

COSMO-RS has been shown to reliably describe phase diagrams of binary (see Figure 6) and multi-component mixtures. Because only unimolecular QC/COSMO calculations are required, which can be stored in a database for repeated use, and because the statistical thermodynamics takes only milliseconds, COSMO-RS is a versatile and efficient tool (see flow diagram in Figure 7) for the treatment of solvation problems. COSMO-RS thus becomes a valuable supplement to the well established, but physically much less well-founded, group contribution methods like UNIFAC. A comparison of these methods is given below.

Beyond the good quantitative results, COSMO-RS provides an entirely new understanding of interactions in solution. The σ-profiles provide a new way of comparing chemical compounds, and the moments of these distributions, i.e. the σ-moments, are excellent linear descriptors for arbitrary partition behaviour. They are therefore of great value for QSAR studies in drug design and related questions. Finally, it should be mentioned that the description of molecular interactions by screening charge densities gives an integrated understanding of electrostatics and 'lipophilicity', which are often considered to be separate physical entities in molecular field analysis tools. Thus any partition coefficient can be described as a surface integral of the respective chemical potential difference and hence trivially visualized by a colored molecular surface.
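The σ-moments mentioned above are the moments of the σ-profile, M_i = Σ p(σ) σ^i over the profile bins. A minimal sketch follows; the function name and the toy profile are assumptions, and the choice of moments up to third order is only for illustration.

```python
import numpy as np


def sigma_moments(sigma, profile, max_order=3):
    """Moments M_i = sum_k profile_k * sigma_k**i for i = 0..max_order.

    sigma:   bin centers of the sigma-profile (e/A^2)
    profile: surface area in each bin (A^2)
    """
    sigma = np.asarray(sigma, dtype=float)
    profile = np.asarray(profile, dtype=float)
    return [float(np.sum(profile * sigma ** i)) for i in range(max_order + 1)]


# Hypothetical symmetric toy profile with three bins.
toy_sigma = [-0.01, 0.0, 0.01]
toy_profile = [10.0, 30.0, 10.0]
m0, m1, m2, m3 = sigma_moments(toy_sigma, toy_profile)
```

The zeroth moment is the total surface area, the first moment vanishes for a neutral, symmetric profile like this one, and the second moment grows with the overall polarity of the surface. A linear model for a partition property would then take the form of a weighted sum of such moments, with weights fitted to experimental data.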

It should be noted in addition that several applications to polymers, e.g. solubility of compounds in polymers and oil swelling of polymeric matrices, gave surprisingly good results.

In contrast to group contribution methods and force-field MD and MC methods, which have been developed over decades by many research groups, COSMO-RS is still in its early years and therefore has enormous potential for further development. Accuracy as well as breadth of applicability are expected to be considerably enhanced within the next few years by a number of planned improvements.

Comparison of COSMO-RS with UNIFAC

+COSMO-RS needs very few parameters
+COSMO-RS is able to handle rare and exotic molecules
+COSMO-RS is able to handle transition states
+COSMO-RS is able to resolve isomers
+COSMO-RS does not make mean field assumptions
+COSMO-RS does not make additivity assumptions
-COSMO-RS is presently slightly less accurate (in the core region of organic solvents)
-COSMO-RS needs a time-consuming QM-calculation (but only once per molecule)
+COSMO-RS is young and full of improvement potential

COSMO is meanwhile available in several quantum chemical programs, e.g. MOPAC, AMPAC, MNDO, DMol, TURBOMOLE, ADF, GAMESS-US, ORCA and others. Reliable COSMO-RS parameterizations are presently available for DMol and TURBOMOLE, and others will be provided soon. A special parameterization for semi-empirical methods is being worked out by a PhD student.

In summary, COSMO-RS provides an entirely novel approach to solvation questions. It already enables many interesting and valuable calculations, some of which are not possible otherwise. Within the next few years it will be improved and extended. We therefore consider COSMO-RS to be the solvation method of the future in computational chemistry and chemical engineering.


If you are interested in getting access to the COSMOtherm/COSMO-RS technology, please contact us.


© Copyright 1999-2010 COSMOlogic GmbH & Co. KG, All rights reserved