Registration Dossier

Please be aware that this old REACH registration data factsheet is no longer maintained; it remains frozen as of 19th May 2023.

The new ECHA CHEM database has been released by ECHA, and it now contains all REACH registration data. There are more details on the transition of ECHA's published data to ECHA CHEM here.

Physical & Chemical properties

Partition coefficient

Administrative data

Link to relevant study record(s)

Reference

Endpoint:
partition coefficient
Type of information:
(Q)SAR
Adequacy of study:
supporting study
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
results derived from a valid (Q)SAR model and falling into its applicability domain, with adequate and reliable documentation / justification
Justification for type of information:
1. SOFTWARE
EPISuite v4.11
2. MODEL (incl. version number)
KOWWIN v1.68
3. SMILES OR OTHER IDENTIFIERS USED AS INPUT FOR THE MODEL
Smiles: CC(N)C(=O)NC(CSSCC(NC(=O)C(C)N)C(O)=O)C(O)=O
4. SCIENTIFIC VALIDITY OF THE (Q)SAR MODEL
- Defined endpoint: Partition coefficient octanol/water (log Kow)
- Unambiguous algorithm: KOWWIN uses a "fragment constant" methodology to predict log P. In a "fragment constant" method, a structure is divided into fragments (atom or larger functional groups) and coefficient values of each fragment or group are summed together to yield the log P estimate. KOWWIN’s methodology is known as an Atom/Fragment Contribution (AFC) method. Coefficients for individual fragments and groups were derived by multiple regression of 2447 reliably measured log P values. KOWWIN’s "reductionist" fragment constant methodology (i.e. derivation via multiple regression) differs from the "constructionist" fragment constant methodology of Hansch and Leo (1979) that is available in the CLOGP Program (Daylight, 1995). See the Meylan and Howard (1995) journal article for a more complete description of KOWWIN’s methodology.
- Defined domain of applicability: Currently there is no universally accepted definition of model domain. However, it should be considered that log P estimates may be less accurate for compounds outside the molecular weight range of the training set compounds, and/or that have more instances of a given fragment than the maximum for all training set compounds. Although the training set of the model contains a large number of diverse molecules and can be considered abundant, it is also possible that a compound may be characterised by structural features (e.g. functional groups) not represented in the training set, with no respective fragment/correction coefficient developed. These points should be taken into consideration when interpreting model results.
- Appropriate measures of goodness-of-fit and robustness and predictivity: Please refer to 'attached justification' for more detailed information.
- Mechanistic interpretation: KOWWIN’s "reductionist" fragment constant methodology (i.e. derivation via multiple regression) differs from the "constructionist" fragment constant methodology of Hansch and Leo (Hansch, C. and Leo, A.J., Substituent Constants for Correlation Analysis in Chemistry and Biology, Wiley, New York, 1979). A more complete description of the KOWWIN methodology is given in: Meylan, W.M., and Howard, P.H., Atom/Fragment Contribution Method for Estimating Octanol-Water Partition Coefficients, J. Pharm. Sci. 84: 83-92, 1995.
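
The additive scheme described above can be sketched in a few lines. This is a minimal illustration of a fragment-contribution sum, not KOWWIN itself: the fragment names, the coefficient values and the intercept are hypothetical placeholders, not the regression-derived values of the model.

```python
from typing import Dict


def afc_log_p(fragment_counts: Dict[str, int],
              coefficients: Dict[str, float],
              intercept: float = 0.0) -> float:
    """Sum coefficient * count over all fragments, then add the intercept."""
    missing = [name for name in fragment_counts if name not in coefficients]
    if missing:
        # A fragment without a coefficient is not covered by the
        # (hypothetical) training set, so no estimate is returned.
        raise KeyError(f"no coefficient for fragments: {missing}")
    return intercept + sum(coefficients[name] * n
                           for name, n in fragment_counts.items())


# Hypothetical coefficients, for illustration only.
HYPOTHETICAL_COEFFS = {"CH3": 0.5, "CH2": 0.5, "OH": -1.5}

# Hypothetical molecule made of one CH3, one CH2 and one OH fragment.
example = afc_log_p({"CH3": 1, "CH2": 1, "OH": 1},
                    HYPOTHETICAL_COEFFS, intercept=0.2)
```

Raising an error for an unknown fragment mirrors the applicability-domain caveat above: a structural feature with no fragment coefficient cannot be estimated reliably.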

5. APPLICABILITY DOMAIN
- Descriptor domain: Molecular weight, type of "fragment"
- Similarity with analogues in the training set:
APPENDIX D of the HELP section in KOWWIN v1.68 contains the fragments used in the training set. The substance consists of fragments which are part of the training set. Moreover, as depicted above, the log Kow values of chemicals exceeding the molecular weight range of the training set and/or exceeding the complexity of the training set fragments are also predicted with sufficient accuracy.

6. ADEQUACY OF THE RESULT
As explained in detail in the sections above, the substance falls within the range of reliable predictivity. The substance falls within the molecular weight range of the model. Furthermore, the substance consists of common functional groups; thus, the results of the estimation are considered sufficient to fulfil the information requirements for registration.
Qualifier:
no guideline followed
Principles of method if other than guideline:
- Software tool(s) used including version: EPISuite v4.11
- Model(s) used: KOWWIN v1.68
- Model description: see field 'Attached justification'
- Justification of QSAR prediction: see field 'Attached justification'
GLP compliance:
no
Type of method:
other: QSAR estimation
Partition coefficient type:
octanol-water
Key result
Type:
log Pow
Partition coefficient:
-2.9
Temp.:
25 °C
Remarks on result:
other: pH value not reported
Conclusions:
In this study report, the partition coefficient of N,N'-di-L-Alanyl-L-cystine (CAS No 115888-13-6) was estimated using EPISuite/KOWWIN v1.68. Based on the results, the log Kow is considered to be -2.9.
Endpoint:
partition coefficient
Type of information:
(Q)SAR
Adequacy of study:
key study
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
results derived from a valid (Q)SAR model and falling into its applicability domain, with adequate and reliable documentation / justification
Justification for type of information:
1. SOFTWARE
ACD/Percepta 2018.1.1 (Build 3044. 9 Aug 2018)

2. MODEL (incl. version number)
ACD/Percepta 2018.1.1 (Build 3044. 9 Aug 2018)

3. SMILES OR OTHER IDENTIFIERS USED AS INPUT FOR THE MODEL
Smiles: CC(N)C(=O)NC(CSSCC(NC(=O)C(C)N)C(O)=O)C(O)=O

4. SCIENTIFIC VALIDITY OF THE (Q)SAR MODEL
- Defined endpoint:
log Kow (log P) – The logarithm of a ratio of concentrations of un-ionized compound between its solutions in n-octanol and water: LogKo/w

The dataset used to develop the reported model has been compiled from a great number of different sources covering a wide variety of experimental protocols used to determine the log Ko/w values reported within them. This includes the classical potentiometric log Ko/w determination methods involving phase titrations, as well as more contemporary chromatographic methods such as HPLC on standard and modified (immobilized artificial membrane (IAM) and liposome chromatography) resins, capillary electrophoresis, and centrifugal partition chromatography. Since log Ko/w takes into account only the partition of neutral species, when the method involves only a single data point measurement (i.e. the log Ko/w is not determined by extrapolation from a pH dependence curve), the water phase is usually buffered to a pH at which the predominant state of the analyzed compound is neutral. For a comprehensive overview of the experimental log Ko/w measurement techniques, please see [1].

log Ko/w is a relatively easily measured property. As a result the experimental data quality, which is usually inversely proportional to the complexity of the experiment, is reasonably good. Independent external studies show that the error between the logKo/w measurements performed by different laboratories using the same protocol (reproducibility) can be expected to be within 0.5 logarithmic units [2].

- Unambiguous algorithm:
Linear fragmental QSAR model
Summation of additive increments of the following types:
(i) carbon atoms that are not doubly or triply connected to any heteroatom (so called Isolating Carbons, or ICs);
(ii) functional groups (FGs) obtained after removing all ICs from a given molecule;
(iii) inter-fragmental interactions (FIs) between any pair of FGs separated by up to four isolating atoms (larger atom chains are considered if direct conjugation between the interacting groups can occur, e.g. in naphthalene and larger aromatic polycycles).

log Ko/w = SUM[1..n](ai*f(IC)i) + SUM[1..n](bi*f(FG)i) + SUM[1..n](ci*f(FI)i) + intercept

where f(IC)i, f(FG)i, and f(FI)i are the occurrence counts of the i-th isolating carbon, functional group, and interaction, respectively, and ai, bi, and ci are the corresponding statistical coefficients.
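
As an illustration of the three-term sum, the following sketch evaluates the equation for a made-up fragmentation. All fragment labels, increments and the intercept are hypothetical; the actual ACD/LogP increment tables are not reproduced in this record.

```python
from typing import Mapping


def sum_increments(counts: Mapping[str, int],
                   increments: Mapping[str, float]) -> float:
    """Sum increment * occurrence count for one descriptor class."""
    return sum(increments[name] * n for name, n in counts.items())


def classic_log_p(ic_counts: Mapping[str, int],
                  fg_counts: Mapping[str, int],
                  fi_counts: Mapping[str, int],
                  a: Mapping[str, float],
                  b: Mapping[str, float],
                  c: Mapping[str, float],
                  intercept: float) -> float:
    """log Ko/w = SUM(a*f(IC)) + SUM(b*f(FG)) + SUM(c*f(FI)) + intercept."""
    return (sum_increments(ic_counts, a)
            + sum_increments(fg_counts, b)
            + sum_increments(fi_counts, c)
            + intercept)


# Hypothetical increments and counts, for illustration only.
a = {"aliphatic CH2": 0.5}            # isolating carbons (ICs)
b = {"OH": -1.0}                      # functional groups (FGs)
c = {"OH/OH via 2 carbons": 0.25}     # inter-fragmental interactions (FIs)

estimate = classic_log_p({"aliphatic CH2": 2}, {"OH": 2},
                         {"OH/OH via 2 carbons": 1}, a, b, c, intercept=0.0)
```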

Increments of ICs and FGs depend on the type of constituent atoms (including branching and cyclization), whereas increments of FIs depend on the interacting groups and on the separating atoms (type and number) in between. If new fragments are involved (that were not present in the training set), their increments are estimated through summation of the "polar atom increments" (which differ from "conventional" IC and FG increments due to internal conjugation or the alpha-effect). Missing interaction increments (FIs) involving new functional groups come from either bi-directional Hammett-type equations (for aliphatic interactions) or "generalized atom chains" (for aromatic interactions), the latter assuming that the effect depends on the first atom of a given "missing group" (the atom directly attached to the Isolating Carbon through which the inter-fragmental interaction takes place). New FG and FI increments can also be supplied through the "algorithm training" feature, in which the software automatically identifies all missing fragments and interactions (and estimates their increments) based on the user's own data. In addition, special algorithms were applied for calculating the +/- uncertainty errors, which depend on the increments (and compound classes from the training set) used in the calculation (see the description of the Applicability Domain).

Descriptors in the model:
Fragmental descriptors, dimensionless (occurrence count). Combination of atomic, fragmental and inter-fragmental descriptors (see Explicit Algorithm)

Descriptor selection:
All increments were derived using the "constructionist" approach, which analyses the available compounds step-wise in order of increasing structural complexity. It starts with simple hydrocarbons (yielding parameters of various Isolating Carbons), then mono-functional compounds (yielding parameters of functional groups), then bi-functional compounds (with clear interactions between two functional groups, yielding parameters of pair-wise inter-fragmental interactions), and so on. After each step the obtained parameters are generalized (to cover the maximum extent of structural diversity) and fixed for the next step of analysis (to minimize ambiguity in the physicochemical meaning of new parameters). The exact procedure is described in [3], whereas the resulting efficiency is evaluated in [4] and [5].

Algorithm and descriptor generation:
Algorithm involves (i) splitting structure into Isolating Carbons, functional groups and pairs of interacting groups, (ii) estimating all increments (from pre-defined tables and/or secondary algorithms, as described above), (iii) summing all increments to obtain "global" log Ko/w value, and (iv) estimating the uncertainty error that depends on the least certain increments used in above calculations. All descriptors are generated using the "IC-method" as described in [3,4,6].

Chemicals/Descriptors ratio:
Number of compounds in the training set: 3,600; number of parameters in the most accurate algorithm (+/-0.3 or better): 2,500; in the less accurate algorithm (+/-0.5 or worse, when inter-fragmental interactions are estimated by the secondary algorithms): 500. The large number of parameters is retained for the sake of the physicochemical clarity of inter-fragmental interactions (all correction factors are assigned to the classical types of inductive, resonance and H-bonding interactions). This number could easily be reduced without a substantial loss of prediction accuracy (as discussed in [4-6]), yet this would diminish the physicochemical clarity of each parameter and make the algorithm no different from many other "purely statistical calculations".

- Defined domain of applicability:
ACD/Log P is applicable to all types of compounds with molecular weight up to 1,000 Daltons, log Ko/w between -2 and +12, and not bearing any heavy metals that may form coordinating bonds.

Method used to assess the applicability domain:
Depends on the types and numbers of functional groups and interactions between groups, each of which is assigned with particular uncertainty value. Very roughly, the following levels of parameter uncertainty are used:
(i) +/- 0.3 ("very accurate"), when all parameters came from a stepwise analysis of a large compound class with consistent data, e.g. any types of hydrocarbons and most types of mono-functional compounds with common polar groups, such as -OH or -CONH2;
(ii) +/- 0.5 ("moderately accurate"), when some parameters came from a smaller set of compounds and/or data of weaker consistency, e.g. involving less-common groups such as -NHC(=O)NHC(=S)-, or interacting pairs of groups;
(iii) +/- 1.0 ("poorly accurate"), when some parameters were estimated by "secondary algorithms" (i.e., query structure involved new fragments that were not present in the training set).

On a more delicate level, various combinations of different parameter uncertainty are considered (as described in [3]). All error calculations are based on empirical comparisons of class-specific predictions to experimental data that was not used in algorithm development.
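
To show how a reported uncertainty relates to the three nominal levels listed above, the sketch below brackets a value between the nearest levels. It is a simplified reading of the scheme: the function name and the bracketing rule are assumptions, and the software's actual combination rules (described in [3]) are not reproduced here.

```python
import math
from typing import List, Tuple

# Nominal uncertainty levels and labels as listed in the text above.
LEVELS: List[Tuple[float, str]] = [
    (0.3, "very accurate"),
    (0.5, "moderately accurate"),
    (1.0, "poorly accurate"),
]


def bracket_uncertainty(u: float) -> Tuple[str, str]:
    """Return the labels of the nominal levels just below and just above u."""
    if u <= 0:
        raise ValueError("uncertainty must be positive")
    lower = LEVELS[0][1]
    for value, label in LEVELS:
        if math.isclose(u, value):
            return (label, label)      # exactly on a nominal level
        if u < value:
            return (lower, label)      # between two nominal levels
        lower = label
    return (lower, lower)              # worse than the last level


# The +/- 0.79 reported for this substance lies between levels (ii) and (iii).
labels = bracket_uncertainty(0.79)
```

With this rule, +/- 0.79 falls between "moderately accurate" and "poorly accurate", which matches the classification given for the substance in the applicability domain section of this record.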

- Appropriate measures of goodness-of-fit and robustness and predictivity:
No data available

- Mechanistic interpretation:
Additivity-constitutivity considerations resting on bi-directional Hammett-type equations [3, 4].

[1] Avdeef, A., Absorption and Drug Development: Solubility, Permeability, and Charge State, John Wiley & Sons, Inc., Hoboken, NJ, 2003.
[2] Kishi, H. and Hashimoto, Y., Evaluation of the procedures for the measurement of water solubility and n-octanol/water partition coefficient of chemicals results of a ring test in Japan, Chemosphere, 1989, 18, 1749-1759.
[3]Petrauskas, A., Kolovanov, E., ACD/Log P Method Description. Persp. In Drug Design, 2000, 19, 1-19
[4]Japertas, P., Didziapetris, R., Petrauskas, A., Fragmental Methods in the Design of New Compounds. Applications of Advanced Algorithm Builder. Quant. Struct.-Act. Relat., 2002, 21, 23-37
[5]Mannhold, R., Petrauskas, A., Substructure versus Whole Molecule Approaches for calculating Log P. QSAR Combi. Sci., 2003, 22, 466-475
[6]Japertas, P., Didziapetris, R., Petrauskas, A., Fragmental Methods in the Analysis of Biological Activities of Diverse Compound Sets. Mini. Rev. Med. Chem., 2003, 3, 797-808

5. APPLICABILITY DOMAIN
The substance has a molecular weight < 1000 and the predicted log Kow is between -2 and +12; it therefore fits in the applicability domain. Further, the prediction is classified as moderately accurate to poorly accurate (+/- 0.79).
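
The domain test stated here can be written as a simple check. This is a sketch under assumptions: the function and parameter names are invented for illustration, and the molecular weight of about 382.5 g/mol is not given in this record; it was computed from the formula C12H22N4O6S2 implied by the input SMILES.

```python
from typing import Tuple


def in_classic_domain(mol_weight: float,
                      log_p: float,
                      has_coordinating_heavy_metal: bool = False,
                      max_mw: float = 1000.0,
                      log_p_range: Tuple[float, float] = (-2.0, 12.0)) -> bool:
    """Check the three stated limits: MW, log Ko/w range, no heavy metals."""
    low, high = log_p_range
    return (mol_weight <= max_mw
            and low <= log_p <= high
            and not has_coordinating_heavy_metal)


APPROX_MW = 382.5        # assumption: derived from the SMILES, not from the record
PREDICTED_LOG_P = 0.27   # value reported for the classic ACD/LogP model

result = in_classic_domain(APPROX_MW, PREDICTED_LOG_P)
```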

6. ADEQUACY OF THE RESULT
The substance fits in the applicability domain of the model. The prediction is valid and can be used for classification and risk assessment.

Principles of method if other than guideline:
- Justification of QSAR prediction: see field 'Justification for type of information'
GLP compliance:
no
Type of method:
calculation method (fragments)
Partition coefficient type:
octanol-water
Type:
log Pow
Partition coefficient:
0.27
Remarks on result:
other: QSAR
Remarks:
Calculated LogP: 0.27 +/- 0.79
Conclusions:
In this study report, the partition coefficient of N,N'-di-L-Alanyl-L-cystine (CAS 115888-13-6) was estimated using the classic logP module of ACD/Percepta 18.1.1. The logP of N,N'-di-L-Alanyl-L-cystine is considered to be 0.27.
Endpoint:
partition coefficient
Type of information:
(Q)SAR
Adequacy of study:
supporting study
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
results derived from a valid (Q)SAR model and falling into its applicability domain, with adequate and reliable documentation / justification
Justification for type of information:
1. SOFTWARE
ACD/Percepta 2018.1.1 (Build 3044. 9 Aug 2018)

2. MODEL (incl. version number)
ACD/Percepta 2018.1.1 (Build 3044. 9 Aug 2018)

3. SMILES OR OTHER IDENTIFIERS USED AS INPUT FOR THE MODEL
Smiles: CC(N)C(=O)NC(CSSCC(NC(=O)C(C)N)C(O)=O)C(O)=O

4. SCIENTIFIC VALIDITY OF THE (Q)SAR MODEL
- Defined endpoint:
log Kow (log P) – The logarithm of a ratio of concentrations of un-ionized compound between its solutions in n-octanol and water: LogKo/w
The dataset used to develop the reported model has been compiled from a great number of different sources covering a wide variety of experimental protocols used to determine the log Ko/w values reported within them. This includes the classical potentiometric log Ko/w determination methods involving phase titrations, as well as more contemporary chromatographic methods such as HPLC on standard and modified (immobilized artificial membrane (IAM) and liposome chromatography) resins, capillary electrophoresis, and centrifugal partition chromatography. Since log Ko/w takes into account only the partition of neutral species, when the method involves only a single data point measurement (i.e. the log Ko/w is not determined by extrapolation from a pH dependence curve), the water phase is usually buffered to a pH at which the predominant state of the analyzed compound is neutral. For a comprehensive overview of the experimental log Ko/w measurement techniques, please see [1].
log Ko/w is a relatively easily measured property. As a result the experimental data quality, which is usually inversely proportional to the complexity of the experiment, is reasonably good. Independent external studies show that the error between the logKo/w measurements performed by different laboratories using the same protocol (reproducibility) can be expected to be within 0.5 logarithmic units [2].

Experimental data from various sources have been used. The characteristics of the entire dataset compiled for the task of this model development is:
No. of compounds = 16277
Min. Value = -5.08
Max. Value = 11.29
Std. Dev. = 1.92
Skewness = 0.22

- Unambiguous algorithm:
Global linear baseline QSAR + local similarity-based corrections. The global QSAR was developed using PLS in combination with a bootstrapping technique. This method implies random compound sampling from the initial training set, i.e. generation of new “training sub-sets”.

Each of the sampled sub-sets is of the same size as the initial training set; however, because the sub-sets are populated at random, some compounds are selected more than once while others are omitted. This procedure is performed 100 times and an independent PLS model is derived for every sub-set.
Each of those PLS models is based on 2D fragmental descriptors:

log Ko/w = SUM[i=1..n](ai*fi) + c

where fi is the number of occurrences of the i-th fragment in a molecule, ai is its statistical coefficient, and c is the intercept.
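
The resampling step and the linear fragment equation can be sketched as follows. This is only an outline under assumptions: the PLS fit itself is omitted, the function names are invented for illustration, and the coefficients in the example are hypothetical.

```python
import random
from typing import List, Sequence


def bootstrap_indices(n: int, n_models: int = 100,
                      seed: int = 0) -> List[List[int]]:
    """Draw n_models sub-sets of size n by sampling row indices with replacement."""
    rng = random.Random(seed)
    return [[rng.randrange(n) for _ in range(n)] for _ in range(n_models)]


def predict(counts: Sequence[int], coeffs: Sequence[float],
            intercept: float) -> float:
    """log Ko/w = SUM(ai * fi) + c for one molecule."""
    return intercept + sum(a * f for a, f in zip(coeffs, counts))


# 100 sub-sets drawn from a toy training set of 10 compounds.
subsets = bootstrap_indices(n=10, n_models=100)

# Hypothetical two-fragment model applied to a molecule with counts (2, 1).
value = predict([2, 1], [0.5, -1.0], intercept=0.1)
```

Sampling with replacement is what makes each sub-set the same size as the original set while repeating some compounds and leaving others out. One model would then be fitted per sub-set, giving each compound a vector of 100 predictions.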

As a result, each global QSAR model actually represents an ensemble of 100 PLS models, providing each compound with a vector of 100 log Ko/w predictions, each based on a slightly different sub-set of the initial training set. Two compounds with similar trends in the variation patterns of the 100-value vectors predicted by a global QSAR model are defined as similar in terms of the analyzed property, i.e. the differences in the compound sets used to parameterize each of the 100 PLS models constituting a baseline model affect the estimations for the two compounds in a similar way. The correlation coefficient of the two vectors is called the Individual Similarity Index between two compounds (SIi). An analogous definition of the “property-specific” or dynamic similarity was first used by Tetko and his co-workers [3-7], and this method has recently been used in the analysis of acute toxicity data [8].

With the available robust similarity measure, it becomes possible to analyse the performance of the baseline QSAR model in the local chemical environment of a query molecule represented by the most similar compounds in the training set. In case any systematic errors are encountered for sufficiently similar compounds, a local correction (Δ) is calculated.
Later on it is possible to train the model quickly and efficiently using new experimental data by just adding it to this second similarity correction calculation procedure, without the time costly baseline model re-training.
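
A minimal sketch of the two quantities described above: the individual similarity index as the correlation of two prediction vectors, and the local correction as the mean residual of the most similar compounds. The Pearson form of the correlation, the function names and the toy numbers are assumptions made for illustration.

```python
import math
from typing import Sequence


def similarity_index(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation between two vectors of ensemble predictions."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_u * var_v)


def local_correction(experimental: Sequence[float],
                     baseline: Sequence[float]) -> float:
    """Mean of (experimental - baseline prediction) over the similar compounds."""
    diffs = [e - b for e, b in zip(experimental, baseline)]
    return sum(diffs) / len(diffs)


# Toy vectors: v varies exactly in step with u, so the index is 1.
si = similarity_index([1.0, 1.2, 0.9, 1.1], [2.0, 2.4, 1.8, 2.2])

# Toy residuals: the baseline underestimates both neighbours by 0.5.
delta = local_correction([1.0, 2.0], [0.5, 1.5])
```

Because the correction depends only on stored residuals of similar compounds, new experimental data can be added to this step without refitting the baseline ensemble, as the text notes.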
Descriptors in the model:
Fragmental descriptors, dimensionless (occurrence count). Fixed set of fragmental descriptors, based on the expanded list of Platts-type fragments (see [9]). A fixed and relatively small set of fragments was used due to the specifics of the employed modeling methodology. In order for the correlation between two compound vectors of log Ko/w predictions coming from a baseline QSAR model to be representative of compound similarity in terms of the analyzed property, these vectors have to be parameterized using exactly the same set of fragmental descriptors. This prevents the use of any sort of automated fragmentation routines (atom based, isolating carbon based, chain based, etc.) that result in a dynamic set of fragments depending on the training set structures. Such routines leave the possibility that, for a query structure from outside the training set, the same rules will yield new fragments not encountered in the training set molecules, which is not compatible with the main condition just mentioned. On the other hand, it is equally important for the model to be able to identify any new structural features of a query molecule that were not present in the training set compounds. That is, the fixed fragment set cannot be constructed from an analysis of the training set either, or of any other finite set of molecules, because in that case any new structural features not present in that set would eventually be ignored. As a result, the fragmental descriptor set is based on general knowledge and considerations regarding all possible chemical structures rather than on a finite dataset, and includes all the fragments, even those that are not detected in the training set molecules at all.

Descriptor selection:
The last fact mentioned in Section 4.3 also excludes the possibility to employ any of the usual descriptor selection techniques relying on the generation of a large initial pool of various descriptors and its subsequent reduction during the statistical analysis (exclusion of statistically insignificant, intercorrelated variables, etc.). Such an analysis by definition would have to be based on a certain dataset, and would not allow having “blank” fragments in the final variable set.

Algorithm and descriptor generation:
The generation of the descriptor matrix following the outlined approach consisted of counting the occurrences of each of the pre-defined fragments in the training set molecules. This procedure, as well as all the subsequent statistical analyses, was performed using Algorithm Builder 1.8 software.

Software name and version for descriptor generation:
Algorithm Builder 1.8
ACD/Labs, Inc. 110 Yonge Street, 14th floor, Toronto, Ontario, Canada M5C 1T4.
http://www.acdlabs.com

Chemicals/Descriptors ratio:
30.2 (11387 chemicals in the training set, 377 descriptors)

- Defined domain of applicability:
Applicability domain of the model is defined based on the training set compounds. This procedure takes into account the following two aspects:
* Similarity of the tested compound to the training set. No reliable predictions can be made if we have no similar compounds in the training set;
* Consistency of the experimental values with regard to the baseline model for similar compounds. Even if there are similar compounds in the dataset, the quality of the prediction could be lower if their data cannot be reproduced by the baseline model. Whatever the reason for this inconsistency (experimental variability, or a sudden change in mechanism of action caused by slight structural changes), it indicates possible problems when trying to give accurate predictions.

Method used to assess the applicability domain:
The two aspects mentioned above receive their quantitative assessment in terms of the Similarity Index (SI) and the Data-Model Consistency Index (DMCI). The SI, evaluating how distant the query structure is from the whole training set, is calculated by weighted averaging of the individual Similarity Indices (SIi) between the test molecule and each of the 5 most similar compounds from the training set. The DMCI is calculated by comparing the differences between experimental and global QSAR predicted values for the 5 most similar compounds with the suggested similarity correction value (Δ) for the test compound, which is calculated by averaging these differences. The more the individual differences are scattered around the calculated average (Δ), the more inconsistent the data for the similar compounds are with regard to the global QSAR model.
The final prediction Reliability Index is calculated as a product of the aforementioned two indices:
RI = SI * DMCI
Both SI and DMCI are scaled to vary from 0 to 1, so the resulting RI also varies in this range. Lower values suggest a compound being further from the Model Applicability Domain and the prediction less reliable (low SI or low DMCI either alone or in combination can be the reason). On the other hand, high RI values indicate an increasing confidence about the quality of the prediction (both SI and DMCI have to be high to yield such a result).

Limits of applicability:
Reliability Index < 0.3 (predictions with an RI below 0.3 are regarded as outside the applicability domain)
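
The product rule and the threshold can be sketched as below. The SI and DMCI inputs in the example are hypothetical values chosen only to reproduce an RI of 0.32; the record reports the final RI but not its two components, and the treatment of a value exactly at the threshold is an assumption.

```python
def reliability_index(si: float, dmci: float) -> float:
    """RI = SI * DMCI, with both indices scaled to the range 0..1."""
    for name, value in (("SI", si), ("DMCI", dmci)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return si * dmci


def in_applicability_domain(ri: float, threshold: float = 0.3) -> bool:
    """A prediction counts as inside the domain when RI exceeds the threshold."""
    return ri > threshold


# Hypothetical components giving an RI of 0.32.
ri = reliability_index(0.8, 0.4)
inside = in_applicability_domain(ri)
```

Because RI is a product, a low value of either index alone is enough to push a prediction out of the domain, while a high RI requires both indices to be high.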

- Appropriate measures of goodness-of-fit and robustness and predictivity:
The statistics of the training set data:
No. of compounds = 11387
Min. Value = -5.08
Max. Value = 11.29
Std. Dev. = 1.94
Skewness = 0.25

Statistics provided for the fraction of the training set that falls within the applicability domain of the model (RI > 0.3 - see Section 5.4)
N(RI > 0.3) = 11371 (i.e. 99.9% of the training set compounds)
R2 = 0.944
Std. Dev. = 0.457
RMSE = 0.457
F = 402696.2 (Fisher's F-statistic)

The statistics of the validation set data:
No. of compounds = 4890
Min. Value = -4.64
Max. Value = 10.89
Std. Dev. = 1.90
Skewness = 0.16

Random splitting of the initial dataset into the training and validation sets using the ratio 70%:30%.

Statistics provided for the fraction of the validation set that falls within the applicability domain of the model (RI > 0.3 - see Section 5.4)
N(RI > 0.3) = 4872 (i.e. 99.6% of all the validation set compounds)
R2 = 0.940
Std. Dev. = 0.464
RMSE = 0.464
F = 165247.5 (Fisher's F-statistic)

Analysis of the subsets of the higher quality results
N(RI > 0.5) = 4772 (i.e. 97.6% of all the validation set compounds)
R2 = 0.945
Std. Dev. = 0.444
RMSE = 0.444
F = 177716.6 (Fisher's F-statistic)

N(RI > 0.75) = 3345 (i.e. 68.4% of all the validation set compounds)
R2 = 0.964
Std. Dev. = 0.360
RMSE = 0.360
F = 197041.9 (Fisher's F-statistic)
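
The counts and percentages reported in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the record; the variable and function names are chosen for illustration.

```python
TRAINING_SET = 11387
VALIDATION_SET = 4890
FULL_DATASET = 16277
DESCRIPTORS = 377


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# The training and validation sets together make up the full dataset.
split_is_consistent = TRAINING_SET + VALIDATION_SET == FULL_DATASET
training_share = percent(TRAINING_SET, FULL_DATASET)
validation_share = percent(VALIDATION_SET, FULL_DATASET)

# Fractions of compounds inside the applicability domain at each RI cut-off.
in_domain = {
    "training, RI > 0.3": percent(11371, TRAINING_SET),
    "validation, RI > 0.3": percent(4872, VALIDATION_SET),
    "validation, RI > 0.5": percent(4772, VALIDATION_SET),
    "validation, RI > 0.75": percent(3345, VALIDATION_SET),
}

# Chemicals/descriptors ratio quoted earlier in the record.
chemicals_per_descriptor = round(TRAINING_SET / DESCRIPTORS, 1)
```

The results agree with the reported figures: a 70:30 split, in-domain shares of 99.9%, 99.6%, 97.6% and 68.4%, and a chemicals/descriptors ratio of 30.2.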

- Mechanistic interpretation:
Mechanistic basis of the model:
The only mechanistic consideration utilized in model building is the use of a linear regression method (PLS) and the fragmental descriptors. In other words, it is assumed that the final predicted value is a linear combination of the contributions of all the structural moieties making up the test molecule. Although very basic, this consideration is one of the most fundamental ones: even the name of (Q)SAR methods implies that the main determinant of all the properties of a compound is its structure. Fragments are therefore the most direct, first-hand descriptors of a chemical structure.

A priori or a posteriori mechanistic interpretation:
A posteriori model interpretation results are consistent with generally understood mechanistic factors or scientific interpretations and with well documented experimental facts. For example, the top ten fragmental descriptors with negative coefficients are the following:
Any positive permanent charge = -2.436
Quaternary ammonium = -1.612
Permanent charge on aromatic N, O, S, Se = -1.317
Sulfonic acid = -1.125
alpha-Amino acid = -0.965
N-oxide = -0.674
tertiary amine (>N-) = -0.673
=S< = -0.670
Any phosphorus atom = -0.573
Lactone = -0.404
Some of those fragments are very well known for their effect of increasing the hydrophilicity of a compound. One more classical example of such a water-phase-favorable group, the hydroxy fragment, follows this top ten almost immediately with a statistical coefficient of -0.400.
Among the groups with the largest positive coefficients, the absolute majority of them can be clearly expected to increase the hydrophobic properties of a compound, e.g.:
Bicyclo [3.1.1] scaffold = 1.103
Spiro [5.2] scaffold = 1.066
Any Si atom = 0.714
Spiro [6.6] = 0.678
Spiro [6.5] = 0.644
Fused 6:5:5 scaffold = 0.614
Steric hindrance in the form of two bulky branched aliphatic substituents in both ortho- positions of a phenolic group = 0.460
n-Pentyl chain = 0.452
n-Heptyl chain = 0.442
Aromatic sulphur = 0.419
Note: the average of all 377 statistical coefficients is 0.018
All the fragments encoding strong H-bonding in the aromatic system (e.g., ortho-keto, ortho-thioketo, ortho-nitro, or ortho-halogenated phenols and anilines - 6 descriptors in total) have positive coefficients, which is in agreement with the known fact that H-bonding reduces hydrophilicity.
The coefficients of 6 fragments mentioned range from +0.005 to +0.455 with an average of +0.15.
Further similar examples can be established as well.

[1] Avdeef, A., Absorption and Drug Development: Solubility, Permeability, and Charge State, John Wiley & Sons, Inc., Hoboken, NJ, 2003.
[2] Kishi, H. and Hashimoto, Y., Evaluation of the procedures for the measurement of water solubility and n-octanol/water partition coefficient of chemicals results of a ring test in Japan, Chemosphere, 1989, 18, 1749-1759.
[3]I.V. Tetko, Neural network studies. 4. Introduction to associative neural networks, J. Chem. Inf. Comput. Sci. 2002, 42, 717-728.
[4]I.V. Tetko and P. Bruneau, Application of ALOGPS to predict 1-octanol/water distribution coefficients, logP, and logD, of AstraZeneca inhouse database, J. Pharm. Sci. 2004, 93, 3103-3110.
[5]I.V. Tetko and V.Y. Tanchuk, Application of associative neural networks for prediction of lipophilicity in ALOGPS 2.1 program, J. Chem. Inf. Comput. Sci. 2002, 42, 1136-1145.
[6]H. Zhu, A. Tropsha, D. Fourches, A. Varnek, E. Papa, P. Gramatica, T. Oberg, P. Dao, A. Cherkasov, and I.V. Tetko, Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis, J. Chem. Inf. Model. 2008, 48, 766-784.
[7]I.V. Tetko, I. Sushko, A.K. Pandey, H. Zhu, A. Tropsha, E. Papa, T. Oberg, R. Todeschini, D. Fourches, and A. Varnek, Critical assessment of QSAR models of environmental toxicity against Tetrahymena pyriformis: focusing on applicability domain and overfitting by variable selection, J. Chem. Inf. Model. 2008, 48, 1733-1746.
[8]Sazonovas, A., Japertas, P., and Didziapetris, R., Estimation of reliability of predictions and model applicability domain evaluation in the analysis of acute toxicity (LD50), SAR QSAR Environ. Res. 2010, 21, 127-148.
[9]J.A. Platts, D. Butina, M.H. Abraham, and A. Hersey, Estimation of molecular linear free energy relation descriptors using a group contribution approach, J. Chem. Inf. Comput. Sci. 1999, 39, 835-845.

5. APPLICABILITY DOMAIN
The Reliability Index for the prediction is RI = 0.32, indicating that the substance is in the applicability domain.

6. ADEQUACY OF THE RESULT
The substance fits in the applicability domain of the model. The prediction is valid and can be used for classification and risk assessment.
Qualifier:
no guideline followed
Principles of method if other than guideline:
- Software tool(s) used including version: ACD/Percepta 14.0.0 (Build 2726. 27 Nov 2014)
- Model(s) used: ACD/LogP GALAS
- Model description: see field 'Justification for type of information'
- Justification of QSAR prediction: see field 'Justification for type of information'
GLP compliance:
no
Type of method:
other: QSAR estimation
Partition coefficient type:
octanol-water
Key result
Type:
log Pow
Partition coefficient:
< -2
Temp.:
25 °C
Remarks on result:
other: QSAR estimation; pH not reported; Reliability index = 0.32 (moderate)
Details on results:
Experimental results for the most similar structure reported by the program:
L-Cystine; LogP (used in model): -2.58; Similarity: 0.58; experimental LogP: -5.08
J. Chmelik, J. Hudecek, K. Putyera, J. Makovicka, V. Kalous and J. Chmelikova, Coll. Czech. Chem. Commun., 1991, 56(10), 2030-2041.
Conclusions:
In this study report, the partition coefficient of N,N’-di-L-Alanyl-L-cystine (CAS 115888-13-6) was estimated using the GALAS logP model of the program ACD/Percepta 18.1.1. The logP of N,N’-di-L-Alanyl-L-cystine is considered to be < -2.

Description of key information

- QSAR estimation of the partition coefficient using ACD/Percepta 2018.1.1 (Build 3044. 9 Aug 2018) / ACD/LogP classic, log P: 0.27

- QSAR estimation of the partition coefficient using ACD/Percepta 2018.1.1 (Build 3044. 9 Aug 2018) / ACD/LogP GALAS, log P: < -2

- QSAR estimation of the partition coefficient using EPISuite v4.11/ KOWWIN v1.68, log P: -2.9

Key value for chemical safety assessment

Log Kow (Log Pow):
0.27
at the temperature of:
25 °C

Additional information

No experimental data are available for the partition coefficient of N,N’-di-L-Alanyl-L-cystine. Three QSAR estimations were performed using three different models.

N,N’-di-L-Alanyl-L-cystine and its structural fragments are within the applicability domain of each of the models used. The highest value is used as the key value for the chemical safety assessment.
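
The selection rule stated above can be illustrated as follows. The three estimates are those listed under "Description of key information"; representing the censored GALAS result "< -2" by its upper bound of -2.0 is an assumption made for this sketch, as are the class and field names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Estimate:
    model: str
    value: float
    is_upper_bound: bool = False  # True when the result is reported as "< value"


ESTIMATES: List[Estimate] = [
    Estimate("ACD/LogP classic", 0.27),
    Estimate("ACD/LogP GALAS", -2.0, is_upper_bound=True),  # reported as "< -2"
    Estimate("EPISuite v4.11 / KOWWIN v1.68", -2.9),
]

# The key value is the highest of the three estimates.
key_estimate = max(ESTIMATES, key=lambda e: e.value)
```

Since the censored value can only be lower than its bound, comparing by the bound cannot change which estimate is the highest here.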