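The title of my thesis is "Revisão Taxonômica das Serpentes da Tribo Philodryadini Cope 1886 (Serpentes: Dipsadidae); Aplicação de Métodos Comparativos Usando Dados Morfológicos e Moleculares" (Taxonomic Revision of the Snakes of the Tribe Philodryadini Cope 1886 (Serpentes: Dipsadidae); Application of Comparative Methods Using Morphological and Molecular Data), and my advisor is Professor Dr. Hussam Zaher.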
Scaling morphometric variables and removing the effect of allometric growth.
Morphometric variables and multivariate morphometric data matrices are routinely used in analyses of morphological variation across different areas of biology, such as taxonomy, evolution, systematics, and ecology. However, the nature of these variables is generally not taken into account when running analyses that aim to understand variation in shape rather than variation in size.
My proposal is to create a function that scales morphometric variables by applying a normalization technique designed for variables that exhibit allometric growth, producing a data set that is ready for analysis and free of the effect of size (size-free). The function will receive a matrix of original morphometric data and a set of arguments, and will apply the transformation. In the end there will be two data sets, the original and the scaled one. As output, the function will run a PCA on each data set and present the plots side by side, the original next to the scaled. Likewise, given an argument that selects a categorical variable defining groups in the morphometric matrix, a comparative table will present the results of a MANOVA test performed on each data set (original and scaled). The normalization follows Lleonart et al. (2000).
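The scaling step described above can be sketched for a single variable as follows. This is a minimal illustration, not the proposed function itself: the helper name `scale_allometric` and the simulated values are assumptions introduced here, and the reference size is taken to be the mean of the body size variable.

```r
# Hypothetical helper for illustration only; not the function proposed in this text.
scale_allometric <- function(y, x) {
  # The slope of log(y) on log(x) estimates the allometric exponent b in y = a * x^b
  b <- unname(coef(lm(log(y) ~ log(x)))[2])
  # Project every individual onto the reference size x0 = mean(x)
  y * (mean(x) / x)^b
}

# Simulated data (assumed values): body size x and a trait y that grows allometrically
set.seed(42)
x <- runif(60, min = 20, max = 120)
y <- 1.5 * x^0.7 * exp(rnorm(60, sd = 0.05))

y_star <- scale_allometric(y, x)

# Correlation with size on the log scale, before and after scaling
cor_before <- cor(log(y), log(x))
cor_after <- cor(log(y_star), log(x))
print(c(before = cor_before, after = cor_after))
```

Because the exponent comes from an ordinary least-squares fit on the log scale, the scaled trait should be uncorrelated with log size in the sample, up to floating-point error.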
size.free {unknown}                R Documentation

Data transformation that removes the allometric effect of body size from morphometric data

Description:
Removes the effects of allometric growth from morphometric variables. The removal applies the proposal of Lleonart et al. (2000), which uses a modification of the allometric growth equation.

Usage:
size.free(x,vpd=...,vf=...)

Arguments:
x      An R object of class "data.frame". Rows must be individuals and columns must be numeric variables, with at least one variable of class "factor".
vpd    Column number (x[,i]) of the variable in the data set (data.frame) that will be used as the standard body size to calculate the growth constant of each variable.
vf     Column number (x[,i]) of the factor variable in the data set (data.frame) that will be used to perform a MANOVA on the original and scaled variables.

Details:
The transformation applies equation 13 of Lleonart et al. (2000: page 88):

Yi* = Yi (X0/Xi)^b

where Yi is the observed value for individual i of the variable Y to be scaled, Xi is the observed value for individual i of the standard body size variable X, X0 is the mean of the standard body size variable, and b is the growth constant. The growth constant is calculated using linear regression on the log-transformed variables.

Value:
Graphics    Returns a comparative plot (scaled next to original data) of a principal component analysis (prcomp).
MANOVA      Returns the summary of a MANOVA (manova) performed on the original data, followed by another MANOVA performed on the scaled data, both using the factor variable (vf) as factor.
File        Saves a file in comma-separated values format (.csv) with the scaled data in the working directory.

Warning:
If there are any NAs or zeros in the input data set, an error message indicates which columns contain them.

Note:
NAs and zero values are not allowed.

Author(s):
Juan Camilo Arredondo jcas36@gmail.com

References:
Lleonart J., J. Salat, & G. T. Torres. 2000. Removing allometric effects of body size in morphological analysis. Journal of Theoretical Biology 205:85–93. http://dx.doi.org/10.1006/jtbi.2000.2043
Thorpe RS. 1976. Biometric analysis of geographic variation and racial affinities. Biological Reviews 51:407–452. http://dx.doi.org/10.1111/j.1469-185X.1976.tb01063.x

Examples:
## data.frame with NAs and zeros
(df<-data.frame(matrix(sample(c(0,NA,8:10),90,replace=TRUE),ncol=6)))
## Factor variable
(SP<-sample(paste("SP",1:3,sep=""),15,replace=TRUE))
## Including the factor variable into the data set
df$SP<-as.factor(SP)
head(df)
## Applying the function size.free (expected to stop with an error while the data contain NAs)
size.free(df,vpd=2,vf=7)
## Eliminating NAs
(jn<-df[sapply(df,is.numeric)])
(jn[is.na(jn)]<-sample((jn[!(is.na(jn))]),length(jn[is.na(jn)])))
df[,1:ncol(jn)]<-jn
## Applying the function size.free (expected to stop with an error while the data contain zeros)
size.free(df,vpd=2,vf=7)
## Eliminating zeros
(jn<-df[sapply(df,is.numeric)])
jn[jn==0]<-sample(jn[jn!=0],length(jn[jn==0]))
df[,1:ncol(jn)]<-jn
## Applying the function size.free
size.free(df,vpd=2,vf=7)
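As a worked note on the equation given under Details, the following derivation shows why the transformation removes the dependence on size. It assumes that the trait follows the power law Y = aX^b with a multiplicative individual deviation; the constant a and the error term ε_i are notation introduced here for illustration and do not appear in the help page.

```latex
\begin{align*}
Y_i &= a\,X_i^{\,b}\,e^{\varepsilon_i} \\
Y_i^{*} &= Y_i\left(\frac{X_0}{X_i}\right)^{b}
        = a\,X_i^{\,b}\,e^{\varepsilon_i}\,\frac{X_0^{\,b}}{X_i^{\,b}}
        = a\,X_0^{\,b}\,e^{\varepsilon_i}
\end{align*}
```

Under that assumption, the scaled value is the value the individual would be expected to show at the reference size X0: the term in Xi cancels, while the individual deviation is kept.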
## vpd is used as argument to define the standard body size variable and vf is the factor variable
size.free<-function(x,vpd,vf)
{
  ## Extracts the numeric variables and stores them in a new data frame
  x1<-x[sapply(x,is.numeric)]
  ## Extracts the factor variable to a new object
  Factor<-as.factor(x[,vf])
  ## Extracts the standard size variable to a new object
  vpadrao<-x[,vpd]

  ####### NAs and Zeros #######
  ## Logical vector that flags the numeric columns containing NAs
  Na<-sapply(x1,function(x)any(is.na(x)))
  ## Logical vector that flags the numeric columns containing zeros
  zero<-sapply(x1,function(x)any(x==0))
  ## Stops the process with an error message if any column contains NAs
  if(any(Na))
  {stop(paste("Please replace NA in column",paste(which(Na),collapse=", ")))}
  ## Stops the process with an error message if any column contains zeros
  else if(any(zero))
  {stop(paste("Please replace Zeros in column",paste(which(zero),collapse=", ")))}

  ####### Growth Constant #######
  ## Copies the data into a new object
  x2<-x
  ## Erases the standard body size variable from the data set
  x2[,vpd]<-NULL
  ## Extracts the numeric variables into a new data frame, reducing to n-1 variables (data set without the standard size variable)
  x3<-x2[sapply(x2,is.numeric)]
  ## Creates a matrix of NAs to later include the regression coefficients
  abs<-matrix(NA,ncol=2,nrow=ncol(x3))
  ## Creates a vector of NAs to later include the growth constants
  b<-rep(NA,ncol(x3))
  ## Loop that calculates the coefficients of each variable
  for(i in 1:ncol(x3))
  {
    ## Regresses log(variable) on log(standard size) and stores the coefficients in the abs matrix;
    ## the slope is the growth constant b of the allometric equation Y=aX^b
    abs[i,]<-coefficients(lm(log(x3[,i])~log(vpadrao)))
    ## Stores the slope (growth constant) of each variable in the object b
    b[i]<-abs[i,2]
  }

  ####### Transformation #######
  ## Transforms the data set into a matrix
  n.m<-as.matrix(x3)
  ## Creates a matrix with the values of the growth constants, repeated by columns
  n.b<-matrix(rep(b,rep(nrow(x3),ncol(x3))),ncol=ncol(x3))
  ## Creates a matrix with the values of the standard size variable, repeated by columns
  y<-matrix(rep(vpadrao,ncol(x3)),ncol=ncol(x3))
  ## Calculates the scaled value of every element of the original matrix: Yi*=Yi(X0/Xi)^b
  n.m<-n.m*((mean(vpadrao)/y)^n.b)
  ## Transforms the scaled matrix into a data frame object
  scaled<-as.data.frame(n.m)
  ## Includes the standard size variable in the data set of scaled variables
  scaled$VPD<-vpadrao
  ## Includes the factor in the data set of scaled variables
  scaled$Factor<-Factor

  ####### PCA #######
  ## Original Data ##
  ## Performs a principal component analysis (using the covariance matrix) on the original data (only the numeric variables)
  pcaOrig<-prcomp(x1)
  ## Creates an object with the variance of each principal component (columns in pcaOrig$x)
  vs<-apply(pcaOrig$x,2,var)
  ## Creates an object with the proportion of variation of each principal component
  povar<-vs/sum(vs)
  ## Saves the scores of the principal components in a data frame
  pcs<-as.data.frame(pcaOrig$x)
  ## Includes the Factor variable in the data set of scores of the original data
  pcs$Factor<-Factor
  ## Scaled Data ##
  ## Performs a principal component analysis (using the covariance matrix) on the scaled data (only the numeric variables)
  pcaScal<-prcomp(scaled[sapply(scaled,is.numeric)])
  ## Creates an object with the variance of each principal component (columns in pcaScal$x)
  vs2<-apply(pcaScal$x,2,var)
  ## Creates an object with the proportion of variation of each principal component
  povar2<-vs2/sum(vs2)
  ## Saves the scores of the principal components in a data frame
  pcs2<-as.data.frame(pcaScal$x)
  ## Includes the Factor variable in the data set of scores of the scaled data
  pcs2$Factor<-Factor

  ## PCA Plots ##
  ## Creates a graphic window to include two plots, arranged horizontally
  par(mfrow=c(1,2))
  ## Plots the first principal component against the second, both from the original data
  plot(pcaOrig$x,main="PCA of Original Data",xlab=paste("First Principal Component ",round((povar[1])*100,1),"%"),ylab=paste("Second Principal Component ",round((povar[2])*100,1),"%"),bty="l",type="n")
  ## Plots the points as their respective factor names, for the original data
  text(x=pcs[,1],y=pcs[,2],labels=pcs[,ncol(pcs)])
  ## Plots the first principal component against the second, both from the scaled data
  plot(pcaScal$x,main="PCA of Scaled Data",xlab=paste("First Principal Component ",round((povar2[1])*100,1),"%"),ylab=paste("Second Principal Component ",round((povar2[2])*100,1),"%"),bty="l",type="n")
  ## Plots the points as their respective factor names, for the scaled data
  text(x=pcs2[,1],y=pcs2[,2],labels=pcs2[,ncol(pcs2)])

  ####### MANOVA #######
  ## Performs a MANOVA with the original data
  manova.orig<-manova(as.matrix(x1)~Factor)
  ## Performs a MANOVA with the scaled data
  manova.scal<-manova(as.matrix(scaled[sapply(scaled,is.numeric)])~Factor)
  ## Prints a title before the summary of the Wilks MANOVA test for the original data
  cat("Wilks MANOVA test for Original Data\n")
  ## Prints the summary table of the Wilks MANOVA test for the original data
  print(summary(manova.orig,test="Wilks"))
  ## Prints a blank line
  cat("\n")
  ## Prints a title before the summary of the Wilks MANOVA test for the scaled data
  cat("Wilks MANOVA test for Scaled Data\n")
  ## Prints the summary table of the Wilks MANOVA test for the scaled data
  print(summary(manova.scal,test="Wilks"))

  ####### File #######
  ## Saves a file in the working directory with the scaled data
  write.csv(scaled,"Data Scaled.csv",row.names=FALSE)
  ## Explanatory text for the function and the arrangement of the columns in the scaled file
  cat("\nExplanation\n\nThe File -Data Scaled.csv- was saved in your working\ndirectory, and contains your scaled variables as a\nproduct of the transformation of your original data,\nfollowing the proposal of Lleonart et al., (2000)*.\nThe file contains the same number of variables as\nyour original data; however, the variables of the\nstandard size (vpd) and factor (vf) were named as\n-VPD- and -Factor-, respectively, and moved to the\nend of the data set.\n\n* Lleonart J., J. Salat, & G. T. Torres. 2000.\nRemoving allometric effects of body size in morphological analysis.\nJournal of Theoretical Biology 205:85–93.\nhttp://dx.doi.org/10.1006/jtbi.2000.2043")
}
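The function derives the proportion of variation of each principal component from the variances of the score columns. As a side note, the same proportions can be read from the `sdev` element that `prcomp` returns. The sketch below checks this on the built-in `iris` measurements, which stand in for a morphometric matrix purely for illustration; the object names are assumptions introduced here.

```r
# Built-in iris measurements used as a stand-in morphometric matrix (illustration only)
pca <- prcomp(iris[, 1:4])

# Proportion of variation from the variances of the scores, as done inside size.free
vs <- apply(pca$x, 2, var)
prop_from_scores <- vs / sum(vs)

# Same quantity from the standard deviations stored by prcomp
prop_from_sdev <- pca$sdev^2 / sum(pca$sdev^2)

# Percentages per component
print(round(100 * prop_from_sdev, 1))
```

Both routes should agree, so the choice between them is mostly a matter of readability.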