In-silico Analysis of Phenyl Propanoic Acid Derivatives to Design Potent Peroxisome Proliferator-activated Receptor ( PPAR ) Dual Agonists for Type 2 Diabetes mellitus Therapy

Background: Type 2 Diabetes mellitus (T2DM) is an enfeeble and thriving disorder outlined by the hyperglycemia. The exponentially rising prevalence of T2DM, serves a major dilemma to public health and is marked by a fall in reactiveness of the tissues with respect to insulin that can be restored by the stimulation of Peroxisome Proliferator-Activated Receptors (PPARs). Identification of the structural features of ligands is performed in this study, which can help in modelling potent molecules to activate PPARs and normalize hyperglycaemia. Multiple Linear Regression (MLR) analysis was performed on the dataset of the PPAR’s agonists followed by docking analysis and Molecular Dynamics Simulation (MDS). A ‘sum-model’ is developed to design PPAR activators by adding the activities (EC50) of the ligands contrary to individual subtypes (PPARα and PPARγ) and using it as dependent parameter followed by validation using a test set of molecules, calculating the modified r2 and ‘Leave One-Out’ (LOO) method. The sum-model represents the optimum statistical outcome with R2 = 0.745 and R2pred = 0.838 while the corresponding values for the other two models are 0.747 and 0.732 for α-model, 0.600 and 0.677 for γ-model respectively. Docking analysis revealed that hydrogen bonding interactions are fundamental feature essential for the interaction among PPARs and Phenyl Propanoic Acid (PPA) derivatives. Docking of the most active compound to the proteins was followed by dynamically stabilizing the docked complexes of the PPARγ (1FM9) and PPARα (1K7L). The molecule PPARγ was found to stabilize at 2717ps having an RMSD value of 0.607nm while PPARα stabilized at 4500ps with an RMSD value of 1.1213 nm. The potential energy, kinetic energy and total energy of the 1FM9 at 2717ps were found to be -543611 kj/mol, 96167 kj/ mol and -441916 kj/mol, respectively, and that of 1K7L at 4500ps are found to be -2.995595 kj/mol, 542616 kj/mol and -2.45353 kj/mol respectively.


INTRODUCTION
Diabetes is a intricate and persistent illness which requires continuous medical care. 1 Conventionally, persons having diabetes either completely deprive of insulin, a peptide hormone (Type 1 Diabetes mellitus or T1DM) or possess inadequate insulin or the insulin cannot be used effectively (Type 2 Diabetes mellitus or T2DM) among which T2DM reports for 90 to 95% of all diabetic cases. 2,3DM is designated by high levels of glucose in blood and is a heterogeneous clustre of disorders. 4The escalating rate of T2DM and their pay-off in terms of cardiovascular lithality and despair represent a huge threat to public health. 5Therefore, it is crucial to determine the biological targets for developing novel sanative molecules which can effectively cure T2DM as well as other metabolic syndrome.Among various receptors, Peroxisome Proliferator-Activated Receptors (PPARs), belonging to the nuclear hormone receptor family, 6,7 has been given special attention.PPARs gets activated when the ligands bind to their Ligand-Binding Domains (LBDs).PPARs have 3 subtypes specifically PPARα, PPARβ/δ and PPARγ and they divulge a usual impact on target cells as their actions are restricted to specific tissue types.PPARα mainly affects the metabolism of fatty acids and lowers lipid levels on activation, while PPARγ mostly regulates the adipogenesis, energy balance, and biosynthesis of lipid.PPARβ/δ participates in the oxidation of fatty acid, in cardiac and skeletal muscles, but it also regulates the level of glucose in blood and cholesterol levels.
For designing PPAR dual agonists, MLR models for the solitary subtype might not be conducive as it produce knowledge just about upgrading the activity of that specific subtype.An integrated model having a balanced field with regard to Biological Activity (BA) for both subtypes of receptor is necessary.This could be achieved through adding the BA for creating a distinct variable which can be employed to generate a 'combinedmodel' that varies from the originally developed from the experimental BA and retain unique information.That is feasible since BA and chemical properties of a molecule/chemical compound are simultaneous and therefore, distinct parameters of a molecule/ chemical compound may be obliged for revealing two distinct BA because of the inclusion of distinct biological receptors.After manipulating these BA, a relative combination of variables must be examined which can furnish significant information.Concept of manipulation of activities like division and multiplication was used in previous works to solve the hitches of duality and selectivity of composite triggering agents.Alike conception was applied in this study for molecular modeling of PPAR dual agonists.A testified dataset of PPAR dual agonists was considered for building the MLR models, from which, maximum molecules appeared to have potent EC 50 values for PPARα and PPARγ subtype while slightly potent at PPARβ.Using this activity data, we establish our objective to create an MLR model which can forecast the over-all activity at most potent subtypes and would provide evidence about enhancing the activity for the specific subtypes where the majority of the chemical compounds/ molecules are less active.To confirm that the potent ligands interacts on the active site of the receptor docking studies has been carried out, followed by the Molecular Dynamics (MD) simulation of the complexes. 8The coordinates of the ligand in the original target protein grids are selected as active site to carry out docking studies.

MATERIALS AND METHODS
Quantitative Structure Activity Relationship (QSAR) 9 study was performed on a dataset by means of Multiple Linear Regressions (MLR) 10 using software SPSS. 11The dataset consists of a set of 71 derivatives of Phenyl Propanoic Acid (PPA) out of which 46 active molecules (Fig. 1) were considered to explore QSAR study to obtain potent leads intended for the remedy of T2DM. 6, [12][13][14] Structure building and energy minimization was performed by means of Chem3D ultra 15 while generation of descriptors is done with Molecular orientation environment (MOE) software. 16The selection of potential descriptors from a set of various descriptors is a critical procedure in QSAR. 17Sum model as well as individual MLR model for PPARα and PPARγ were prepared using potential descriptors as independent and pEC 50 value as dependent variable.The final model selected based on correlation coefficient (R 2 ), cross validated correlation coefficient (Q 2 ) and least standard error of estimation (see) values along with test set prediction.

Minimization
The structure of ligands considered for the study was drawn and minimized using the chemdraw ultra molecular modeling package. 15On the basis of a topical study carried on the crystal structure of proteins with a resolution higher than 2.5A°, 18 accessible in the Protein Data Bank (PDB), 19,20 we anticipated that PPARα and PPARγ, both molecules shares analogous conformation due to a similar structure in the LBD of the receptor.This study can be useful in establishing the fact that the molecules with structural similarity bind to a target in the same way.All molecules were minimized using Chem 3D Ultra software package. 15

Qsar analysis
The study of the statistical relationship among physicochemical properties of a group of molecules/compounds and their experimental activity is defined as QSAR where EC 50 values are linearly correlated to the descriptor fields.Correlation analysis was performed followed by cross validation analysis.The statistical parameters calculated to judge the models were R 2 , see, Q 2 and modified correlation coefficient (r 2 m).The cross-validated r 2 (r 2 cv) that results in the optimal number of descriptors and least standard error of prediction (sp) are considered.To evaluate the predictive potential of the model, R 2 pred (correlation coefficient), sp (error of prediction) and modified correlation coefficient (r 2 m(test)) of the test set were also determined in all studies.The descriptors having correlation less than 0.6 were considered for model generation as shown in Table 1.

Docking Analysis
To evaluate binding modes of the molecules with PPARγ and PPARα receptors, molecular docking was performed.The optimized chemical structures were used for QSAR and molecular docking studies.A very dynamic docking program FlexX 2.3.1 developed by Rarey and co-workers (presently commercialized by BioSolveIT) 21 was used to examine the binding interaction of the best active compound with the PPAR receptor.FlexX is relied on a vigorous incremental construction algorithm which fragmentize the ligand and then flexibly constructing it in the binding pocket with diverse placement strategies.For the purpose of docking 3 Dimensional structure with a respective resolution of 2.65 Å and 2.5 Å of receptors PPARγ and PPARα is taken from the PDB.19, 20 The active site coordinates of co-crystallized ligand were considered for molecular docking of all of the PPA compounds for calculating the dock scores with FlexX.The atoms within a specified radius of 6.5 A˚ of the co-crystallized ligand were defined as an active site.The docking scores were determined from the docked molecular complexes for each ligand.Docking analysis was executed for 100 conformations, and the top scored complexes/poses with a most negative flexX score were further analyzed.On the basis of ligand orientation and FlexX score, single complex for each best fitted ligand was carefully chosen, and its dock score was built-in to the molecular spreadsheet.

Validation
For the developing a dependable QSAR model having acceptable predictive power, validation is an important step. 22,23External validation is a necessary step to test the predictive power of the QSAR model to ensure that the model will fit well on a new dataset even if it manifests a good predictive ability based on the statistics of the test set.For this purpose modified correlation coefficient (r 2 m) is calculated which can be defined as the measure of the degree of variance among the experimental activity and the predicted ones.The models are accepted only if they satisfy these criteria: R 2 >0.5, Q 2 >0.5, see<0.5 and r 2 m >0.5.24For the validation of docking studies the receptor protein was made to dock with the natural ligand and the mode of binding was observed and compared with the other ligands in the dataset.

Molecular Dynamics Simulation
MD simulations were performed on PPARγ (PDB ID 1FM9) and PPARα (PDB ID 1K7L) receptor-ligand complex by using gromacs 25 to evaluate the stability of the complex.MD studies were performed to explain different behavior of target at the molecular level.However, Gromacs incorporates a more detailed temperature system and has more functionality built in for executing protein ligand interactions. 26,27Energy minimization is a very important step in MD and can be performed using conjugate gradients, lbfgs or steepest descent out of which the default method steepest descent is considered for energy minimization in this work.Cubic box type (with box size 0.5 taken as default) has been preferred over default box type i.e. rectangular, for the purpose of minimizing edge effects in a finite system to apply periodic boundary conditions in which the atoms of the system to be simulated are put into the space filling box, which is surrounded by translated copies of itself.For a system of particles, Potential Energy can be described by a force field, which is a parameter of mathematical functions.GROMOS96 43a1 force field is chosen which is an improved force field suited for molecular dynamics simulation of proteins and is parameterized with cutoff 1.4 nm Lennard-Jones.The molecular dynamics study is done by taking into consideration certain parameters as input such as constraints set as all-bonds, Integrator as MD (molecular dynamics), Berendsen temperature coupling on, and temperature 300k etc.The MD run is set for 2000000 steps for PPARγ and 250000 steps for PPARα which is equal to 4000 and 4500ps respectively.

Statistical Investigation
Three self-determining MLR models were put up with alike training set: (i) α-model, built on basis of the EC 50 of the compounds/molecules as the dependent parameter with respect to PPARα, (ii) γ-model, built on basis of the EC 50 of the molecules for the PPARγ, (iii) 'sum-model', on the basis of summation of BA at both PPARα and PPARγ subtypes.The data set is subjected to stepwise MLR analysis, so as to develop SAR between BA (dependent parameter) and various descriptors (independent variables).Several models were acquired out of which best model chosen on the basis of predictions obtained from the training set (internal prediction) and test set (external prediction).The statistically obtained parameters for MLR models are generated are displayed in Table 2.The optimal statistical results of the sum-model with R 2 > 0.7 and R 2 pred > 0.8 refuse any likelihood of accidental correlation.

MLR Analysis of PPARα Agonists
A similar set of descriptors used for the model preparation as taken in sum model and the equation obtained is shown in Model 1.The statistical quality of the model is: nTr=34; R 2 tr=0.732;see=0.316;Q 2 =0.677; nTs=12 and R 2 ts=0.747,where, nTr is number of molecules in the training set, R 2 tr is coefficient of correlation for the training set, see is the standard error of estimation, Q 2 is cross validated correlation coefficient, and nTs is number of molecule in test set, R 2 ts is coefficient of correlation for test set.The predicted activity for test and training set compounds calculated from the model is given in Table 3.The correlation plot obtained between observed and predicted activity is given in Figure 2.

MLR Analysis of Sum Model
The best univariate model showed BCUT_SMR_1 (nTr = 34, R 2 = 0.292) as the vital characteristic adding to the activity.Descriptors BCUT_SMR_1 and GCUT_SMR_2 (R 2 =0.453) contribute to bivariate model.The BCUT_SMR_1 descriptor is found to be dominating in illumination the variation in BA as supported by the concluding QSAR equation.The values of descriptors (independent variables) applied in Model 3 are not inter correlated (R<0.6)(Table 1).The statistical attributes of the model are: nTr =34; R 2 tr=0.838;see=0.300;Q 2 =0.620; nTs =12 and R 2 ts=0.745.
The acquired MLR model revealed the significance of BCUT_SMR_1, GCUT_SMR_2, BCUT_PEOE_3 descriptors, wiener path (Distance matrix and Adjacency descriptors), and VSURF_ DW13 (Shape, Surface area and volume) descriptors.The positive coefficient of GCUT_SMR_2 and BCUT_SMR_1 indicates that molar refractivity of ligands have an optimistic influence on the activity of molecule respectively.The model derived from the MLR studies revealed the significance of the above mentioned descriptors for the interaction between receptor and ligands.Similar results from MLR studies of the dataset for all PPAR subtypes and sum model justifies that minimum van der Waals surface area favor the activity of molecules and increase in PE, molar refractivity and wiener path number can favor the activity.

Molecular Docking Studies
The hydrogen bonding interactions are the fundamental characteristics obligatory for the binding of PPARs and PPA derivatives.Molecular Docking is a computational approach that uses an algorithm which generates a huge number of potential structures.By this method prediction of favored orientation of one molecule to another while bound to each other to form a steady complex is done.

Protein and Ligand Preparation
Protein preparation is a necessary step in the docking analysis in which it is assumed that the ligand protein complex 3D structure is known.

Hydrogen atoms are added to fill out open valences.
A reference ligand is used in order to assess the success of docking solution.The ligand preparation tool of LeatIT is able to deal with a reference ligand and automatically gives a reliable state of the ligand.

PPARα
The substituted PPA derivatives reportedly exhibit strong agonistic activities against PPARs. 6, [12][13][14] In this study, we monitored the agonistic activities only against PPARs using molecular docking scores and MLR studies.The binding site is constituted/ defined by all the residues which are present in close vicinity from the reference ligand.At the active site, 100 conformational binding modes for the most active and reference ligand were generated.Each of the 100 conformations were observed using pose view diagram, and the best score conformation was chosen.The minimum binding energy indicated that the PPAR receptor (target protein) was successfully docked with PPA derivative is shown in the Table .The possible binding modes of most active PPA derivative at PPARα active sites have been shown in Fig. 3. PPARα protein residues Ile241, Leu247, Ile272, Phe273, Cys275, Cys276, Thr279, Ser280, Tyr314, His440, Tyr464 etc. formed active site in the protein.Residue Ser280, Tyr314, His440 and Tyr464 found to form Hbond with the reference ligand molecule.Most active ligand showed relatively good binding affinity as compared to the rest of the ligands.Visualization of docked ligand in PPARα reveals that the carboxylate group is located in a favorable position to interact with the hydroxyl group of Ser280 and Tyr314 with FlexX score of -15.82.Similar kind of orientations was observed from docking of the natural ligand with the receptor with FlexX score -32.15.

PPARγ
The molecular interactions at PPARγ active sites with natural ligand and the most active PPA derivative were shown in Fig. 4. PPARγ protein residues Gln286, Ile241, Leu453, Ile281, Phe360, Phe363, Cys285, Phe282, Ser342, Tyr327, His449, Arg288, Ile341, Met364, Leu330 formed active site in the protein.Residue Ser342 and Tyr327 found to form Hbond with the reference ligand molecule with FlexX score -31.54.Visualization of docked ligand in PPARγ reveals that the carboxylate group is located in a favorable position to interact with the hydroxyl group of Tyr327 residue while the N atoms behaves as an H bond acceptor.Docking of the natural ligand on the active site revealed that the Tyr327 residue interacts in the similar manner as the C25 with FlexX score -28.90.

Molecular Dynamics Analysis of PPARγ and PPARα
In Molecular dynamics studies on PPAR nuclear receptors gives a stable protein structures at 2717ps (for PPARγ) and 4500ps (for PPARα) respectively.The application of explicit-water MD computations allow for assessing the movement of the systems.For the examined systems, the RMSDs of all the side chain atom positions from their prior configuration as a function of simulation time are shown in Fig. 5 (A).The RMSD values are reviewed

Validation
All the MLR models were confirmed using a test set containing 12 compounds and the results are shown in Table 3.The prediction efficacy of the models was acceptable.The predicted EC 50 is of the Sum-model is compared with the sum of predicted EC 50 of individual models (last three columns of Table 3).
The resultant values of EC 50 in these columns are fairly comparable and therefore build the sureness in the specified models as expected.The EC 50 of the training set compounds was calculated by means of LOO (cross-validation) technique in QSAR analysis of the α-and γ-subtypes and sum model.The Q 2 , r 2 m and see was found to be 0.677; 0.719; 0.316, 0.649; 0.665; 0.305 and 0.620; 0.788; 0.300 for model 1, 2 and 3 respectively.The statistical results (Q 2 and r 2 m > 0.5) explained the vigorous nature of the selected models.In external validation or test set validation, R 2 pred for model 1, 2 and 3 was found to be 0.732, 0.677 and 0.838 respectively.For satisfactory governing the predictive power of the models, the values of r 2 m of test set were also evaluated.For deducing the closeness between the predicted and the corresponding observed activity, the value of r 2 m ( test ) is predicted, as the upper value of R 2 pred may not persistently signify a small variation among the predicted and observed activity data.The r 2 m ( test ) was obtained as 0.508, 0.594 and 0.529 for α-and γ-subtypes and sum model respectively.All models in this study revealed with superior R 2 pred (>0.5) and r 2 m ( test ) (>0.5) values that explained the supremacy of the models.

CONCLUSION
The 'sum-model' gave a better overall prediction as compared to the individual PPARα and PPARγ models and the sum-model may be used to sketch new compounds having better 'overall activity' in comparison to the molecules of the dataset, which are the molecules of present concern for T2DM therapy.The 'sum-model' suggested in this paper offers acceptable result for both PPARs dual agonists.Furthermore the docking analysis revealed that the hydrogen bonding interactions are essential for the binding of ligands to the receptor.For evaluating the stability of the receptor ligand complex, molecular dynamics studies were performed and the results were quite acceptable.The examination of water conditions within the binding pocket during the molecular dynamic simulation revealed that the presence of sufficient water molecules inside binding pocket facilitates the hydrogen bonding interaction between receptor and ligand.This approach may well implement for identifying further known multiple triggering drugs.

Fig. 2 .
Fig. 2. Correlation plot between experimental and predicted activity of training and test set molecules of Sum-Model, PPARγ and PPARα agonists

Fig. 3 .
Fig. 3. Molecular interaction using FlexX at the PPARα receptor (1k7l) binding pocket A) Stereoview of docking with the natural ligand.B) Stereoview of docking with the most active ligand C29.C) with the natural ligand.D) Poseview diagram of molecular interaction with the most active ligand C29

Fig. 4 .
Fig. 4. Molecular interaction using FlexX at the PPARγ receptor (1fm9) binding pocket A) Stereoview of docking with the natural ligand.B) Stereoview of docking with the most active ligand C25.C) Poseview diagram of molecular interaction with the natural ligand.D) Poseview diagram of molecular interaction with the most active ligand C25 45353 kj/mol, respectively.The stability and behavior of the model can also be assessed by analyzing the root mean square fluctuations (RMSF) which is found to be 1.108 for PPARγ and 0.555for PPARα as shown in Fig.5 (B).The mean square distance of atoms from the centre of mass is referred as the radius of gyration (Rg) is obtained as 2.467 nm and 3.488 nm respectively The stability and behavior of the model can also be assessed by analyzing the root mean square fluctuations (RMSF) which is found to be 1.108 for PPARγ and 0.555 for PPARα as shown in Fig.5 (C).

Table 3 : Actual (Act) and predicted (Pred) pEC 50 values and the residuals (Res) of training and test set molecules for the
α-,γ-and sum model stability of the equilibrated trajectory and whether the convergence of the calculations is obtained.The values of RMSD for the stabilized conformer found within 0.607nm and 1.1213 nm for PPARγ and PPARα respectively demonstrating the conformational stability of the system.The potential, kinetic and total energy of the 1FM9 at 2717ps are found to be -543611 kj/mol, 96167 kj/mol and -441916 kj/mol, respectively, whereas that of PPARα molecule are found to be -2.995595kj/mol, 542616 kj/mol and -2.