Registration Dossier

Physical & Chemical properties

Partition coefficient

Administrative data

partition coefficient
Type of information:
Adequacy of study:
key study
Study period:
18 April 2017
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
results derived from a valid (Q)SAR model and falling into its applicability domain, with adequate and reliable documentation / justification
A calculation method was used to estimate Log Kow because the substance hydrolyses rapidly to 3-chlorophthalic acid.
Justification for type of information:
1. SOFTWARE
EPI Suite Version 4.11

2. MODEL (incl. version number)
KOWWIN (v1.68)


KOWWIN uses a "fragment constant" methodology to predict log P. In a "fragment constant" method, a structure is divided into fragments (atom or larger functional groups) and coefficient values of each fragment or group are summed together to yield the log P estimate. KOWWIN’s methodology is known as an Atom/Fragment Contribution (AFC) method. Coefficients for individual fragments and groups were derived by multiple regression of 2447 reliably measured log P values. KOWWIN’s "reductionist" fragment constant methodology (i.e. derivation via multiple regression) differs from the "constructionist" fragment constant methodology of Hansch and Leo (1979) that is available in the CLOGP Program (Daylight, 1995). See the Meylan and Howard (1995) journal article for a more complete description of KOWWIN’s methodology.
To estimate log P, KOWWIN initially separates a molecule into distinct atom/fragments. In general, each non-hydrogen atom (e.g. carbon, nitrogen, oxygen, sulfur, etc.) in a structure is a "core" for a fragment; the exact fragment is determined by what is connected to the atom. Several functional groups are treated as core "atoms"; these include carbonyl (C=O), thiocarbonyl (C=S), nitro (-NO2), nitrate (ONO2), cyano (-C≡N), and isothiocyanate (-N=C=S). Connections to each core "atom" are either general or specific; specific connections take precedence over general connections. For example, aromatic carbon, aromatic oxygen and aromatic sulfur atoms have nothing but general connections; i.e., the fragment is the same no matter what is connected to the atom. In contrast, there are 5 aromatic nitrogen fragments: (a) in a five-member ring, (b) in a six-member ring, (c) if the nitrogen is an oxide-type {i.e. pyridine oxide}, (d) if the nitrogen has a fused ring location {i.e. indolizine}, and (e) if the nitrogen has a +5 valence {i.e. N-methyl pyridinium iodide}; since the oxide-type is most specific, it takes precedence over the other four. The aliphatic carbon atom is another example: it does not matter what is connected to -CH3, -CH2-, or -CH<; the fragment is the same. However, an aliphatic carbon with no hydrogens has two possible fragments: (a) four single bonds with 3 or more carbon connections, and (b) any other arrangement not meeting the first criterion.
For various types of structures, it became apparent that log P estimates made from atom/fragment values alone could be improved by including substructures larger or more complex than "atoms"; hence, correction factors were added to the AFC method. The term "correction factor" is appropriate because the values are derived from the differences between the log P estimates from atoms alone and the measured log P values. The correction factors fall into two main groups: factors involving aromatic ring substituent positions, and miscellaneous factors. In general, the correction factors are values for various steric interactions, hydrogen bonding, and effects from polar functional substructures. Individual correction factors were selected through a tedious process of correlating the differences (between log P estimates from atom/fragments alone and measured log P values) with common substructures.
Two separate regression analyses were performed. The first regression related log P to atom/fragments of compounds that do not require correction factors (i.e., compounds estimated adequately by fragments alone). The general regression equation has the following form:
log P = Σ(f_i × n_i) + b (Equation 1)
where Σ(f_i × n_i) is the summation of f_i (the coefficient for each atom/fragment) times n_i (the number of times the atom/fragment occurs in the structure) and b is the linear equation constant. This initial regression used 1120 of the 2447 compounds in the total training dataset.
The correction factors were then derived from a multiple linear regression that correlated differences between the experimental (expl) log P and the log P estimated by Equation 1 with the correction factor descriptors. This regression did not utilise an additional equation constant. The equation for the second regression is:
log P (expl) - log P (eq 1) = Σ(c_j × n_j)
where Σ(c_j × n_j) is the summation of c_j (the coefficient for each correction factor) times n_j (the number of times the correction factor occurs (or is applied) in the molecule).
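
The two successive regressions described above can be illustrated with a minimal sketch. It is not KOWWIN's actual implementation: the descriptor counts, the "true" coefficients used to generate the data, and the helper names (solve_linear, least_squares, eq1) are all made up for illustration. Stage 1 fits fragment coefficients plus a constant on compounds that need no corrections; stage 2 fits correction coefficients to the residuals, with no constant.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def eq1(counts, f, b):
    """Equation 1: sum of coefficient times count, plus the constant b."""
    return sum(fi * ni for fi, ni in zip(f, counts)) + b


# Made-up "true" values, used only to generate synthetic log P data.
TRUE_F, TRUE_B, TRUE_C = [0.3, 0.6], 0.2, [-0.5]

# Stage 1: compounds described adequately by fragments alone.
stage1_counts = [(1, 0), (2, 1), (6, 1), (3, 2), (4, 0)]
stage1_logp = [eq1(n, TRUE_F, TRUE_B) for n in stage1_counts]
# A trailing column of ones lets the fit recover the equation constant b.
*f_fit, b_fit = least_squares([list(n) + [1.0] for n in stage1_counts], stage1_logp)

# Stage 2: compounds that need corrections; each entry is
# (fragment counts, correction-factor counts).
stage2 = [((6, 1), (1,)), ((2, 2), (2,)), ((5, 0), (1,))]
residuals = [
    eq1(n, TRUE_F, TRUE_B) + sum(c * k for c, k in zip(TRUE_C, nc))  # "measured"
    - eq1(n, f_fit, b_fit)                                           # Equation 1 estimate
    for n, nc in stage2
]
# No column of ones here: the second regression has no additional constant.
c_fit = least_squares([list(nc) for _, nc in stage2], residuals)
```

Because the synthetic data are noise-free, the fitted values reproduce the made-up coefficients; with real measurements the residuals would not vanish.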

- Regression Results
Results of the two successive multiple regressions (first for atom/fragments and second for correction factors) yield the following general equation for estimating log P of any organic compound:
log P = Σ(f_i × n_i) + Σ(c_j × n_j) + 0.229
(num = 2447, r2 = 0.982, std dev = 0.217, mean error = 0.159)
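
As a rough illustration of how the general equation above is applied, the following sketch sums coefficient-times-count terms and adds the 0.229 constant. The function name estimate_log_p and the coefficient values in the example call are hypothetical placeholders, not KOWWIN's published fragment or correction values.

```python
EQUATION_CONSTANT = 0.229  # constant from the general equation above


def estimate_log_p(fragments, corrections=()):
    """Estimate log P from (coefficient, count) pairs.

    fragments:   iterable of (f_i, n_i) for each atom/fragment
    corrections: iterable of (c_j, n_j) for each correction factor
    """
    fragment_sum = sum(f * n for f, n in fragments)
    correction_sum = sum(c * n for c, n in corrections)
    return fragment_sum + correction_sum + EQUATION_CONSTANT


# Hypothetical coefficients and counts, for illustration only.
example = estimate_log_p(
    fragments=[(0.30, 6), (0.60, 1)],
    corrections=[(-0.40, 1)],
)
print(round(example, 3))
```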

- Statistical accuracy
Total Training Set Statistics:
number in dataset = 2447
correlation coef (r2) = 0.982
standard deviation = 0.217
absolute deviation = 0.159
avg Molecular Weight = 199.98

Training Set Estimation Error:
within <= 0.10: 45.0 %
within <= 0.20: 72.5 %
within <= 0.40: 92.4 %
within <= 0.50: 96.4 %
within <= 0.60: 98.2 %

- Validation Accuracy
To be effective an estimation method must be capable of making accurate predictions for chemicals not included in the training set. Currently, KOWWIN has been tested on an external validation dataset of 10 946 compounds (compounds not included in the training set). The validation set includes a diverse selection of chemical structures that rigorously test the predictive accuracy of any model. It contains many chemicals that are similar in structure to chemicals in the training set, but also many chemicals that are different from and structurally more complex than chemicals in the training set. The average molecular weight of compounds in the validation set is 258.98 versus 199.98 for the training set.

Total Validation Set Statistics:
number in dataset = 10 946
correlation coef (r2) = 0.943
standard deviation = 0.479
absolute deviation = 0.356
avg Molecular Weight = 258.98

Validation Set Estimation Error:
within <= 0.20: 39.6 %
within <= 0.40: 66.0 %
within <= 0.50: 75.6 %
within <= 0.60: 82.5 %
within <= 0.80: 91.6 %
within <= 1.00: 95.6 %
within <= 1.20: 97.7 %
within <= 1.50: 99.1 %
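
The "within <= x" figures in the training and validation tables are cumulative shares of compounds whose absolute estimation error does not exceed x log units. A minimal sketch of that calculation follows; the function name and the error values are invented for illustration and are not taken from the KOWWIN datasets.

```python
def percent_within(errors, thresholds):
    """Return {threshold: percent of |error| <= threshold} for each threshold."""
    abs_errors = [abs(e) for e in errors]
    total = len(abs_errors)
    return {
        t: 100.0 * sum(1 for e in abs_errors if e <= t) / total
        for t in thresholds
    }


def mean_absolute_deviation(errors):
    """Average of the absolute estimation errors."""
    return sum(abs(e) for e in errors) / len(errors)


# Invented errors (estimated minus measured log P), for illustration only.
sample_errors = [0.05, -0.15, 0.35, 0.55, -0.95]
bands = percent_within(sample_errors, [0.2, 0.4, 0.6, 1.0])
mad = mean_absolute_deviation(sample_errors)
```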

- Datasets
The KOWWIN training and validation datasets can be downloaded from the Internet at:

An appendix lists KOWWIN atom/fragment and correction factor descriptors with corresponding coefficient values. It also includes the number of compounds in the training and validation datasets containing each descriptor and the maximum number of instances that each descriptor occurs in any single compound.
The training dataset includes a total of 2447 compounds. The validation dataset includes a total of 10946 compounds.

- Estimation Domain
The appendix lists (for each fragment) the maximum number of instances of that fragment in any of the 2447 training set compounds and 10946 validation set compounds (the minimum number of instances is of course zero, since not all compounds had every fragment). The minimum and maximum values for molecular weight are the following:

Training Set Molecular Weights:
Minimum MW: 18.02
Maximum MW: 719.92
Average MW: 199.98

Validation Molecular Weights:
Minimum MW: 27.03
Maximum MW: 991.15
Average MW: 258.98

Currently there is no universally accepted definition of model domain. However, users may wish to consider the possibility that log P estimates are less accurate for compounds outside the MW range of the training set compounds, and/or that have more instances of a given fragment than the maximum for all training set compounds. It is also possible that a compound may have a functional group(s) or other structural features not represented in the training set, and for which no fragment coefficient was developed. These points should be taken into consideration when interpreting model results.
Substructure searchable formats of the data can be downloaded at:

Given that the molecular weight is within the acceptable range and no fragment appears more than the training set maximum, the value is considered to be acceptable.
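
The two domain considerations named above (molecular weight range and fragment counts) can be sketched as a simple check. The training-set MW limits come from the figures listed above. The molecular weight of about 182.56 is computed here from the formula C8H3ClO3 and is not stated in this record. The fragment counts and per-fragment training maxima in the example are placeholders; the real maxima are in the KOWWIN appendix.

```python
TRAINING_MW_RANGE = (18.02, 719.92)  # training set minimum and maximum MW


def domain_flags(mw, fragment_counts, training_max_counts,
                 mw_range=TRAINING_MW_RANGE):
    """Return a list of reasons an estimate may be less reliable (empty if none)."""
    flags = []
    low, high = mw_range
    if not low <= mw <= high:
        flags.append(f"MW {mw} outside training range {low}-{high}")
    for name, count in fragment_counts.items():
        max_count = training_max_counts.get(name)
        if max_count is None:
            flags.append(f"fragment '{name}' has no training-set entry")
        elif count > max_count:
            flags.append(f"fragment '{name}' occurs {count} times, "
                         f"training maximum is {max_count}")
    return flags


# Placeholder counts and maxima, for illustration only.
counts = {"Aromatic carbon": 6, "-CL [chlorine, aromatic attach]": 1}
maxima = {"Aromatic carbon": 24, "-CL [chlorine, aromatic attach]": 12}
in_domain = domain_flags(182.56, counts, maxima)
out_of_domain = domain_flags(1200.0, {"unlisted fragment": 1}, maxima)
```

An empty list means neither consideration was triggered; it does not by itself establish that the estimate is accurate.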

Data source

EPI Suite Version 4.11
Bibliographic source:
U.S. Environmental Protection Agency. KOWWIN v1.68 (Sept 2010) © 2000

Materials and methods

Test guideline
according to guideline
other: REACH Guidance on QSARs R.6
Version / remarks:
May/July 2008
GLP compliance:
Type of method:
other: calculation
Partition coefficient type:

Test material

Constituent 1
Chemical structure
Reference substance name:
3-chlorophthalic anhydride
EC Number:
EC Name:
3-chlorophthalic anhydride
Cas Number:
Molecular formula:

Results and discussion

Partition coefficient
Key result
log Pow
Partition coefficient:
ca. 2.713
Remarks on result:
other: No information on temperature and pH is included in the prediction.

Any other information on results incl. tables

Table 1: Results

LogKow Fragment Description
Aromatic carbon
-CL [chlorine, aromatic attach]
-C(=O)O [ester, aromatic attach]
Cyclic ester [di-carbonyl type] correction
Equation Constant

Log Kow = 2.7133

Applicant's summary and conclusion

The octanol/water partition coefficient of the test material was calculated as Log Kow = 2.7133.
Executive summary:

The partition coefficient of the test material was calculated using KOWWIN v1.68 (Sept 2010), © 2000 U.S. Environmental Protection Agency. Given that the substance is an organic molecule within the molecular weight range of the training set compounds and no fragment appears more often than the training set maximum, the prediction is considered to be acceptable.

The octanol/water partition coefficient of the test material was calculated as Log Kow = 2.7133.