Registration Dossier


Please be aware that this old REACH registration data factsheet is no longer maintained; it remains frozen as of 19th May 2023.

ECHA has released the new ECHA CHEM database, which now contains all REACH registration data. More details on the transition of ECHA's published data to ECHA CHEM are available from ECHA.


Physical & Chemical properties

Water solubility


Administrative data

Link to relevant study record(s)

Reference
Endpoint:
water solubility
Type of information:
(Q)SAR
Adequacy of study:
weight of evidence
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
results derived from a valid (Q)SAR model and falling into its applicability domain, with adequate and reliable documentation / justification
Justification for type of information:
1. SOFTWARE:
The Estimation Programs Interface (EPI) Suite™

2. MODEL (incl. version number)
WSKOWWIN v1.42

3. SMILES OR OTHER IDENTIFIERS USED AS INPUT FOR THE MODEL:
Oc1c(C(=O)Nc2ccc(O)cc2)cccc1
CAS no. 526-18-1

4. SCIENTIFIC VALIDITY OF THE (Q)SAR MODEL
- Defined endpoint: Water solubility

- Unambiguous algorithm:
WSKOWWIN (Experimental Water Solubility Database Retrieval) is designed to search a special database of experimental water solubility values and display a value if found. The WSKOWWIN program estimates the water solubility of an organic compound using the compound's log octanol-water partition coefficient (log Kow).
A database of more than 8400 compounds with reliably measured log Kow values had already been compiled from available sources. Most experimental values were taken from a "star-list" compilation of Hansch and Leo (1985) that had already been critically evaluated (see also Hansch et al, 1995) or an extensive compilation by Sangster (1993) that includes many "recommended" values based upon critical evaluation. Other log Kow values were taken from sources located through the Environmental Fate Data Base (EFDB) system (Howard et al, 1982, 1986). A few values were taken from Section 4a, 8d, and 8e submissions to the U.S. EPA under the Toxic Substances Control Act (see http://www.syrres.com/esc/tscats_info.htm). The experimental water solubility database and its index file are named "EXPWSOL.DB" and "EXPWSOL.IDX". WSKOWWIN generates a "structure-representation" for each SMILES entry and then searches the database for a matching "structure-representation".
Water solubilities were collected from the AQUASOL dATAbASE™ of the University of Arizona (Yalkowsky and Dannenfelser, 1990), Syracuse Research Corporation's PHYSPROP© Database (SRC, 1994), and sources located through the Environmental Fate Data Base (EFDB) system (Howard et al, 1982, 1986). Water solubilities were primarily constrained to the 20-25 °C temperature range, with 25 °C being preferred.

A dataset of 1450 compounds (941 solids, 509 liquids) having reliably measured water solubility, log Kow and melting point was used as the training set for developing the new estimation algorithms for water solubility. Standard linear regressions were used to fit water solubility (as log S) with log Kow, melting point and molecular weight.
Melting points were collected from sources such as AQUASOL dATAbASE™, PHYSPROP©, and EFDB as well as the Handbook of Chemistry and Physics (Lide, 1990) and the Aldrich Catalog (Aldrich, 1992).

- Defined domain of applicability:
WSKOWWIN estimates water solubility for any compound with one of two possible equations.  The equations are equations 19 and 20 from Meylan and Howard (1994a) or equations 11 and 12 from the journal article (Meylan et al., 1996).  The equations are:
 
    log S (mol/L)  =  0.796 - 0.854 log Kow - 0.00728 MW + ΣCorrections
    log S (mol/L)  =  0.693 - 0.96 log Kow - 0.0092(Tm-25) - 0.00314 MW + ΣCorrections
 
where MW is the molecular weight, Tm is the melting point (MP) in °C [used only for solids], and ΣCorrections is the summation of the applicable correction factors. When a measured MP is available, the equation containing Tm is used; otherwise, the equation with just MW is used.
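The selection between the two equations can be sketched in Python as follows. This is only an illustrative sketch, not the WSKOWWIN implementation: the function names (`wskowwin_log_s`, `log_s_to_mg_per_l`) and the single `corrections` argument standing in for ΣCorrections are assumptions made here, and the special handling of liquids and the individual correction factors are left out.

```python
from typing import Optional


def wskowwin_log_s(log_kow: float, mw: float,
                   tm_celsius: Optional[float] = None,
                   corrections: float = 0.0) -> float:
    """Return log S (mol/L) from the two regression equations quoted above.

    `corrections` is a placeholder for the summed correction factors;
    the individual factors are not reproduced in this sketch.
    """
    if tm_celsius is not None:
        # Measured melting point available: equation with Tm (deg C) and MW.
        return (0.693 - 0.96 * log_kow - 0.0092 * (tm_celsius - 25.0)
                - 0.00314 * mw + corrections)
    # No measured melting point: equation with MW only.
    return 0.796 - 0.854 * log_kow - 0.00728 * mw + corrections


def log_s_to_mg_per_l(log_s: float, mw: float) -> float:
    """Convert log S (mol/L) to mg/L using the molecular weight in g/mol."""
    return 10.0 ** log_s * mw * 1000.0
```

Passing `tm_celsius=None` mirrors the rule stated above that the MW-only equation is used when no measured melting point is available.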

- Appropriate measures of goodness-of-fit and robustness and predictivity:
WSKOWWIN estimates water solubility with one of two possible equations. When an experimental melting point is available, WSKOWWIN applies the equation containing both the melting point and the molecular weight (MW) parameters. In the absence of a melting point, the equation containing just the molecular weight is used to make the estimate. All compounds in the 1450 compound training set have known melting points or are known to be liquids at 25 °C. The accuracy statistics for the two equations are as follows:
 
For Melt Pt + MW equation: r^2 = 0.970; std deviation = 0.409; avg deviation = 0.313
For MW only equation: r^2 = 0.934; std deviation = 0.585; avg deviation = 0.442
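Because the regressions were fitted to log S, these deviations are presumably expressed in log10 units of solubility. Under that assumption, the short sketch below converts a deviation in log units into a multiplicative factor on S; the helper name `fold_error` is hypothetical and is not part of EPI Suite.

```python
def fold_error(log_unit_deviation: float) -> float:
    """Multiplicative factor on S that corresponds to a deviation in log10 units."""
    return 10.0 ** log_unit_deviation


# Standard deviations quoted above for the two training-set equations.
STD_DEVIATIONS = {
    "Melt Pt + MW equation": 0.409,
    "MW only equation": 0.585,
}

if __name__ == "__main__":
    for name, sd in STD_DEVIATIONS.items():
        print(f"{name}: one std deviation is a factor of about {fold_error(sd):.2f}")
```

Read this way, one standard deviation corresponds to a factor of roughly 2.6 for the equation with melting point and roughly 3.8 for the MW-only equation.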

- Mechanistic interpretation: estimated value from Log Kow.

5. APPLICABILITY DOMAIN
- Descriptor domain:
EPISuite's database (Appendix E) gives the number of compounds in the 1450 compound training set containing each of the correction factors. The WSKOWWIN program applies an individual correction factor only once per structure [if at all] regardless of how many instances of the applicable structural feature occur in the structure. The minimum number of instances is zero and the maximum is one.

Range of water solubilities in the Training set:
Minimum  =  4 x 10^-7 mg/L (octachlorodibenzo-p-dioxin)
Maximum =  completely soluble (various)

Range of Molecular Weights in the Training set:
Minimum  =  27.03 (hydrocyanic acid)
Maximum =  627.62 (hexabromobiphenyl)

Range of Log Kow values in the Training set:
Minimum  =  -3.89 (aspartic acid)
Maximum =  8.27 (decachlorobiphenyl)

Currently there is no universally accepted definition of model domain. However, users may wish to consider the possibility that water solubility estimates are less accurate for compounds outside the MW range, water solubility range and log Kow range of the training set compounds. It is also possible that a compound may have a functional group(s) or other structural features not represented in the training set, and for which no correction factor was developed. These points should be taken into consideration when interpreting model results.
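A simple descriptor-range check against the training-set limits listed above could look like the sketch below. The names `TRAINING_RANGES` and `outside_training_range` are hypothetical. The check covers only the MW and log Kow ranges, so it cannot detect functional groups that are missing from the training set.

```python
from typing import Dict, List, Tuple

# Training-set ranges as listed above (MW in g/mol, log Kow dimensionless).
TRAINING_RANGES: Dict[str, Tuple[float, float]] = {
    "mw": (27.03, 627.62),
    "log_kow": (-3.89, 8.27),
}


def outside_training_range(mw: float, log_kow: float) -> List[str]:
    """Return the names of descriptors that fall outside the training-set range."""
    values = {"mw": mw, "log_kow": log_kow}
    flagged = []
    for name, (low, high) in TRAINING_RANGES.items():
        if not low <= values[name] <= high:
            flagged.append(name)
    return flagged
```

For the substance in this record (MW 229.24, estimated log Kow 2.47), the function returns an empty list, meaning both descriptors lie within the training-set ranges.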

6. ADEQUACY OF THE RESULT
The WSKOWWIN estimation equations were initially validated on two datasets of compounds that were not included in the model training.  A relatively small dataset was tested that consisted of 85 compounds having experimental log Kow values, but no available melting points.  Many compounds in the 85 compound test set decompose before melting and would theoretically have very high melting points (e.g. amino acids and compounds having multiple nitrogens).  The accuracy statistics for the equation used by WSKOWWIN are:
number = 85; r^2 = 0.865; std deviation = 0.961; avg deviation = 0.714

A much larger dataset of 817 compounds was also tested.  All 817 compounds had experimental melting points, but none of the 817 compounds had a reliable experimental log Kow.  The log Kow values used for the validation-testing were estimated (primarily using the KOWWIN program available at that time); therefore, the water solubility estimates are based on estimates for log Kow.  Typically, estimates based on estimates reduce estimation accuracy, but this type of validation can provide insight into the ability of the method.  The accuracy statistics for this dataset are:
number = 817; r^2 = 0.902; std deviation = 0.615; avg deviation = 0.480

It was then validated on a dataset of 6584 compounds collected from HODOC (1990) (compounds not used in the training set) with the following statistical accuracy (Stein and Brown, 1994):
 Average absolute error = 20.4 deg Kelvin
 Standard deviation = 38.1 deg Kelvin
 Average error = 4.3%
Guideline:
other: REACH Guidance on QSARs R.6
Principles of method if other than guideline:
Meylan, W.M. and P.H. Howard.    1994b.   Validation of Water Solubility Estimation Methods Using Log Kow for Application in PCGEMS & EPI (Sept 1994, Final Report).  prepared for Robert S. Boethling, U.S. Environmental Protection Agency, Office of Pollution Prevention and Toxics, Washington, DC;  prepared by Syracuse Research Corporation, Environmental Science Center, Syracuse, NY 13210.
Specific details on test material used for the study:
SMILES : Oc1c(C(=O)Nc2ccc(O)cc2)cccc1
MOL FOR: C13 H11 N1 O3
MOL WT : 229.24
Log Kow (estimated): 2.47

Equation used to make Water Solubility estimation:
Log S (mol/L) = 0.796 - 0.854 log Kow - 0.00728 MW + ΣCorrections (used because no experimental melting point is available)
Water solubility:
916 mg/L
Temp.:
25 °C
Remarks on result:
other: QSAR predicted value

WSKOWWIN predicted that 2-hydroxy-N-(4-hydroxyphenyl)benzamide (also called Osalmid) has a water solubility of 916 mg/L at 25 °C.

Conclusions:
The water solubility estimated from log Kow (2.47, itself an estimated value) is 916 mg/L at 25 °C.
No melting point was used in the estimation equation.
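As a rough plausibility check, the reported value can be compared with the MW-only equation evaluated without any correction term. The sketch below uses only the inputs quoted in this record (log Kow 2.47, MW 229.24, result 916 mg/L); the function names are hypothetical, and the back-calculated correction term is an inference from the rounded inputs, not a value taken from the WSKOWWIN output.

```python
import math

LOG_KOW = 2.47             # estimated log Kow quoted in this record
MW = 229.24                # molecular weight in g/mol
REPORTED_MG_PER_L = 916.0  # water solubility reported by WSKOWWIN


def uncorrected_log_s(log_kow: float, mw: float) -> float:
    """MW-only equation without the correction term, log S in mol/L."""
    return 0.796 - 0.854 * log_kow - 0.00728 * mw


def implied_corrections(reported_mg_per_l: float, log_kow: float, mw: float) -> float:
    """Correction term needed to reproduce the reported solubility."""
    reported_log_s = math.log10(reported_mg_per_l / 1000.0 / mw)
    return reported_log_s - uncorrected_log_s(log_kow, mw)


base_log_s = uncorrected_log_s(LOG_KOW, MW)
base_mg_per_l = 10.0 ** base_log_s * MW * 1000.0
correction = implied_corrections(REPORTED_MG_PER_L, LOG_KOW, MW)

if __name__ == "__main__":
    print(f"uncorrected log S = {base_log_s:.3f} ({base_mg_per_l:.0f} mg/L)")
    print(f"implied sum of corrections = {correction:.2f}")
```

Without corrections the equation gives about 239 mg/L, so the reported 916 mg/L implies a summed correction of roughly +0.58 log units, which would be consistent with a correction factor having been applied.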

Description of key information

Key value for chemical safety assessment

Water solubility:
916 mg/L
at the temperature of:
25 °C

Additional information