INTELLIGENT FRACTIONAL ORDER ITERATIVE LEARNING CONTROL USING FEEDBACK LINEARIZATION FOR A SINGLE-LINK ROBOT

In this paper, iterative learning control (ILC) is combined with an optimal fractional order derivative update law (BBO-$D^{\alpha}$-type ILC) and an optimal fractional order proportional-derivative update law (BBO-$PD^{\alpha}$-type ILC). In the update law of Arimoto's derivative iterative learning control, a first order derivative of the tracking error signal is used. In the proposed method, a fractional order derivative of the error signal, stated in terms of $s^{\alpha}$ where $\alpha \in [0, 2]$, is used to update the iterative learning control law. Two types of fractional order iterative learning control, namely $PD^{\alpha}$-type ILC and $D^{\alpha}$-type ILC, are obtained for different values of $\alpha$. In order to improve the performance of the closed-loop control system, the coefficients of both learning laws, i.e. the proportional gain $k_P$, the derivative gain $k_D$ and the order $\alpha$, are optimized using the biogeography-based optimization (BBO) algorithm. The simulation results are compared with those of the conventional fractional order iterative learning control to verify the effectiveness of BBO-$D^{\alpha}$-type ILC and BBO-$PD^{\alpha}$-type ILC.


INTRODUCTION
Learning is a characteristic of living creatures, human beings among them. Several endeavors have been made to extend this learning ability to engineering systems in design and construction. Iterative learning control is such an important technique in the field of iterative learning systems, proposed by Arimoto and his colleagues in 1984 [1]. Intelligent iterative learning control falls into the intelligent control category. This involves new techniques to control iterative processes in certain amounts of time. In such control algorithms, the controller learns from its past experiences (iterations) to update itself to improve the performance of the closed loop system.
There are several practical examples in industry in which a system or plant must perform a particular duty repeatedly within a fixed time duration. For instance, a robot arm [2] must perform a special task (such as welding, cutting, painting, etc.) along a prescribed geometric path at every iteration. One way to control such a process is to modify the control input from one iteration to the next in order to decrease the error between the actual path and the desired command [2,3]. Typically, the update law of an iterative learning control can be given by:

$$u_{k+1}(t) = u_k(t) + \Delta u_k(t) \tag{1}$$

where $u_{k+1}(t)$ is the input of the process at iteration $(k+1)$ and $\Delta u_k(t)$ is a correction term. Several types of iterative learning control have been developed, which differ in the correction term $\Delta u_k(t)$. A thorough description of linear and nonlinear iterative learning control laws can be found in [4]. Convergence conditions of an adaptive iterative learning law are studied in [5]. A monotonically convergent control law is introduced in [6] to control linear discrete-time systems, while a self-tuning iterative learning control for time-variant systems is proposed in [7].
Iteration-based systems use the ability to learn, together with appropriate tuning of the input, to perform a repetitive operation, although the iteration process is time and cost consuming. Accordingly, algorithms that reduce the number of iterations needed within the learning process have attracted more interest [8]. The first fractional order ILC is reported in [8], where a fractional order learning control law of type $D^{\alpha}$ is proposed. The convergence condition, an important issue in fractional order iterative learning control (FOILC), is also analyzed there in the frequency domain.
Fractional calculus was primarily inspired by Leibniz in 1695 [9]. A fractional derivative is a suitable tool to describe properties such as memory and hereditary behavior of many materials and processes. As a result, broad research has been performed in recent years in the field of real-time modeling using fractional order differential equations. For example, power transmission lines can be modeled more accurately using fractional order models [10]. The occurrence of chaos in fractional order systems has been investigated in a wide range of research [11]. Other applications of fractional order systems include modeling of the oscillatory behavior of viscoelastic materials [12], electrode-electrolyte polarization [13] and quantum evolution of complex systems [14].
In recent years, various research was performed in the field of fractional calculus and its controllers. It is because such controllers are more flexible in comparison with integer order ones. A fractional controller increases the degree of freedom to choose the design parameters. This provides an opportunity for the designer to achieve smoother and more appropriate responses. On the other hand, this increases the performance of the controller in terms of the convergence time and also the steady state error [9].
The first fractional order controller was proposed by Oustaloup in 1988 [15] in the CRONE framework. Later, this controller was extensively used and developed [16,17]. Accordingly, initial fractional order controllers and the application of fractional order calculus in control were introduced. In 1999, fractional PID controllers were proposed [18].
Biogeography-based optimization (BBO) was primarily proposed by Dan Simon in 2008 [19]. The BBO technique is a new global optimization algorithm based on the study of the geographical distribution of biological organisms. Mathematical models of biogeography deal with immigration of species from one island to another island, explaining the creation and extinction of species. Dan Simon further investigated the BBO algorithm in [20], using a new version of this algorithm and applying the probability theory. In parallel, Mehmet Ergezer and Dan Simon used biogeography-based optimization for some combined problems [21]. Other applications of biogeography-based optimization can be mentioned as reduction of movement estimation time in video [22], dynamic deployment of wireless sensor networks [23], solving the problem of economic load dispatch [24] and optimal dispatch in the power systems [25].
Although iterative learning control has been successfully applied to nonlinear systems [8], the convergence analysis is only presented for linear systems. To overcome this shortcoming, in the current manuscript the convergence analysis is performed after the nonlinear system is linearized through a feedback linearization approach. The iterative learning updating law of $D^{\alpha}$ type is simulated for $0 \le \alpha \le 1$. However, the results are shown here to improve when choosing $1 \le \alpha \le 2$. Furthermore, a new optimal fractional $PD^{\alpha}$-type iterative learning controller (FOILC) is proposed for the first time. The coefficients of the $D^{\alpha}$-type and $PD^{\alpha}$-type ILC, namely $k_P$, $k_D$ and $\alpha$, are determined using a BBO algorithm. The rest of the paper is organized as follows: In Section 2, a brief description of fractional calculus is given. Section 3 introduces integer and fractional order iterative learning control. The motion dynamic and structure of a single-link robot arm are presented in Section 4. This will be used as a case study to assess the quality of the proposed fractional type ILC algorithm. In Section 5, the input-state feedback linearization method is applied to the robot arm. The relevant convergence analysis of the fractional order learning law is given in Section 6. A criterion for optimal determination of the FOILC coefficients using a BBO algorithm is presented in Section 7. The performance of the proposed method is studied through its implementation on a robot arm in Section 8. Finally, a conclusion is presented in Section 9.

BRIEF DESCRIPTION OF FRACTIONAL ORDER CALCULUS
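Fractional order calculus is an extension of integer order differentiation and integration. The operator ${}_cD_t^{\alpha}$ denotes the fractional order derivative-integral, which is defined as:

$${}_cD_t^{\alpha} = \begin{cases} \dfrac{d^{\alpha}}{dt^{\alpha}}, & \alpha > 0 \\ 1, & \alpha = 0 \\ \displaystyle\int_c^t (d\tau)^{-\alpha}, & \alpha < 0 \end{cases} \tag{2}$$

where $\alpha$ indicates the order of the derivative-integral. For the derivative, $\alpha$ is a positive real value, while for the integral it is a negative real value. Parameters $t$ and $c$ denote the time and the initial time, respectively. This extension leads to various definitions of fractional calculus. A common definition is Caputo's, which is described as follows.

Definition: Because initial conditions such as $f(c)$, $f'(c)$, … are needed in integer order form, a modification of the Riemann-Liouville fractional derivative, called the Caputo fractional derivative [26], is used:

$${}_cD_t^{\alpha} f(t) = \frac{1}{\Gamma(m-\alpha)} \int_c^t \frac{f^{(m)}(\tau)}{(t-\tau)^{\alpha-m+1}}\, d\tau, \qquad m-1 < \alpha < m \tag{3}$$

where $m$ is the first integer larger than $\alpha$. With $c = 0$, the Laplace transform of the Caputo fractional derivative is:

$$\mathcal{L}\left\{{}_0D_t^{\alpha} f(t)\right\} = s^{\alpha} F(s) - \sum_{k=0}^{m-1} s^{\alpha-k-1} f^{(k)}(0) \tag{4}$$

Unlike the Laplace transform of the Riemann-Liouville fractional derivative, only integer order derivatives $f^{(k)}(0)$ appear in the Laplace transform of the Caputo fractional derivative. In the case of zero initial conditions, the Laplace transform has the form:

$$\mathcal{L}\left\{{}_0D_t^{\alpha} f(t)\right\} = s^{\alpha} F(s) \tag{5}$$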
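As a numerical illustration of a fractional derivative, the following sketch uses the Grünwald-Letnikov approximation, which is a different definition from Caputo's but coincides with it for smooth signals that start from zero when $0 < \alpha < 1$. The function names `gl_weights` and `gl_derivative` and the test signal $f(t) = t$ are assumptions made for illustration only; they do not come from the paper.

```python
import math
import numpy as np

def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_j = (-1)^j * binom(alpha, j), built recursively."""
    w = np.empty(n + 1)
    w[0] = 1.0
    for j in range(1, n + 1):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    return w

def gl_derivative(f_samples, alpha, h):
    """Approximate D^alpha f on a uniform grid with step h (lower terminal at the first sample)."""
    n = len(f_samples) - 1
    w = gl_weights(alpha, n)
    out = np.empty(n + 1)
    for i in range(n + 1):
        # sum_j w_j * f(t_i - j*h), scaled by h^(-alpha)
        out[i] = np.dot(w[: i + 1], f_samples[i::-1]) / h**alpha
    return out

# Hypothetical test signal: f(t) = t on [0, 1], whose half-order derivative is 2*sqrt(t/pi).
t = np.linspace(0.0, 1.0, 1001)
h = t[1] - t[0]
d_half = gl_derivative(t, 0.5, h)
exact_at_1 = 2.0 / math.sqrt(math.pi)
```

For $\alpha = 0$ the weights reduce to the identity and for $\alpha = 1$ to a backward difference, which mirrors the way the fractional operator interpolates between the proportional and derivative cases.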

INTEGER AND FRACTIONAL ORDER ITERATIVE LEARNING CONTROL
The basic idea of iterative learning control is shown in Fig. 1. It is assumed that all signals are defined on the time interval $[0, T]$, where $k$ denotes the number of the experiment or iteration. During the $k$-th iteration, the input signal $u_k(t)$, the actual output $y_k(t)$ and the error signal $e_k(t)$ are stored in memory. This information is used to update the iterative learning control law at iteration $k+1$ in order to improve the control input. The aim is to decrease the error between the actual and the desired system output and thereby to increase the performance of the closed-loop system. The new input must be designed such that the magnitude of the error is smaller than that of the previous iteration.
Fig. 1. Basic idea of iterative learning control (blocks: System, Memory, Iterative Learning Control).

Conventional iterative learning control laws are of proportional and derivative type. For the derivative iterative learning control updating law, $\Delta u_k(t)$ contains only a derivative according to [27]:

$$\Delta u_k(t) = \gamma\, \dot{e}_k(t) \tag{6}$$

Substituting Eqn. (6) into (1) yields the following update law:

$$u_{k+1}(t) = u_k(t) + \gamma\, \dot{e}_k(t) \tag{7}$$

Likewise, a proportional iterative learning control updating law generates $\Delta u_k(t)$ and $u_{k+1}(t)$ as follows:

$$\Delta u_k(t) = \gamma\, e_k(t) \tag{8}$$

$$u_{k+1}(t) = u_k(t) + \gamma\, e_k(t) \tag{9}$$

In equations (6)-(10), $e_k(t)$ is the tracking error between the actual output $y_k(t)$ and the desired path $y_d(t)$ at the $k$-th iteration. This can be obtained by:

$$e_k(t) = y_d(t) - y_k(t) \tag{10}$$

From equations (6) and (8), it can be seen that the way $\Delta u_k(t)$ is formed defines the type of iterative learning control law. In the above equations, $\gamma$ is the learning gain, $u_k(t)$ is the control input at the $k$-th iteration and $k$ denotes the iteration number. The time variable $t \in [0, T]$ may be discrete or continuous, and $T$ is the known duration of each iteration. Input ($u$) and output ($y$) are not known in advance. Equations (11) and (12) introduce the fractional order iterative learning control of $D^{\alpha}$ and $PD^{\alpha}$ types in the time domain, respectively:

$$u_{k+1}(t) = u_k(t) + k_D\, D^{\alpha} e_k(t) \tag{11}$$

$$u_{k+1}(t) = u_k(t) + k_P\, e_k(t) + k_D\, D^{\alpha} e_k(t) \tag{12}$$

where $\alpha = 0$ defines a proportional iterative learning law and $\alpha = 1$ defines a derivative iterative learning law. The frequency domain forms of the fractional order iterative learning controls (11) and (12) are:

$$U_{k+1}(s) = U_k(s) + k_D\, s^{\alpha} E_k(s) \tag{13}$$

$$U_{k+1}(s) = U_k(s) + \left(k_P + k_D\, s^{\alpha}\right) E_k(s) \tag{14}$$

where $E_k(s)$ is the Laplace transform of the error $e_k(t)$ in (10). In equations (11) and (12), the proportional coefficient $k_P$ and the derivative coefficient $k_D$ are unknown constant learning gains that must be appropriately determined. In the current manuscript, the effect of choosing $\alpha$ in the interval $[0, 2]$ on the convergence of the error in (10) will be investigated.
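The iteration-to-iteration mechanism of the update law (1) can be illustrated with a minimal sketch. It assumes a hypothetical first-order discrete-time plant $y[t+1] = a\,y[t] + b\,u[t]$ and a discrete P-type law that uses the one-step-ahead error, $u_{k+1}[t] = u_k[t] + \gamma\, e_k[t+1]$, to account for the one-sample delay. The plant, the gain values and the reference are illustrative assumptions and are not the robot arm of this paper.

```python
import numpy as np

def simulate(u, a=0.3, b=1.0):
    """Run one trial of the hypothetical plant y[t+1] = a*y[t] + b*u[t] from y[0] = 0."""
    y = np.zeros(len(u) + 1)
    for t in range(len(u)):
        y[t + 1] = a * y[t] + b * u[t]
    return y

def run_ilc(y_d, gamma=0.8, iterations=30):
    """Discrete P-type ILC: the input of the next trial is corrected by the shifted error."""
    u = np.zeros(len(y_d) - 1)
    errors = []
    for _ in range(iterations):
        y = simulate(u)
        e = y_d - y                      # tracking error of the current trial
        errors.append(np.max(np.abs(e)))
        u = u + gamma * e[1:]            # memory-based update for the next trial
    return errors

# Hypothetical desired trajectory that starts at the plant's initial output.
y_d = np.sin(np.linspace(0.0, np.pi, 51))
errors = run_ilc(y_d)
```

With these values $|1 - \gamma b| = 0.2$, so the maximum error shrinks from trial to trial; choosing $\gamma$ so that $|1 - \gamma b| \ge 1$ would break this behavior.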

ROBOT ARM STRUCTURE AND THE MOTION DYNAMIC
Robot arms usually perform repetitive operations. It is therefore meaningful to exploit past experience through iterative learning control approaches. This improves the response and increases efficiency and accuracy. A single-link robot arm [28], with the dynamic given in Eqn. (15), is used to study the dynamic and to simulate the behavior of the proposed FOILC.
In Eqn. (15), the variables are the position of the robot hand, the applied joint moment (the input) and the frictional moment; $m$ and $l$ are the mass and the length of the robot arm, respectively. Furthermore, $M$ is the mass of the blade tip, $g$ is the gravitational acceleration and $J$ is the joint moment of inertia. The joint inertia and the frictional moment are described by Eqns. (16) and (17), where Coulomb friction is assumed [28] as in Eqn. (18). The dynamic of the robot arm in Eqn. (15) will be controlled through a state feedback linearizing controller in the next section.

INPUT-STATE FEEDBACK LINEARIZATION CONTROLLER
A schematic diagram of the input-state feedback linearization technique is illustrated in Fig. 2, where a nonlinear controller is realized in two steps. First, a state transformation and an input transformation are used to convert the nonlinear dynamic into a linear one in terms of the input-state of the plant, as $\dot{z} = Az + Bv$. Second, conventional (or even advanced) linear controllers may be used to achieve the control aim. For a system $\dot{x} = f(x) + g(x)u$ of order $n$, it is primarily necessary to check the following two conditions:

1-The vector fields $\{g, ad_f g, \ldots, ad_f^{n-1} g\}$ are linearly independent.
2-Distribution of $\Delta = \mathrm{span}\{g, ad_f g, \ldots, ad_f^{n-2} g\}$ is involutive.

In the above enumerated terms, $ad_f g$ denotes the Lie bracket, which is defined as:

$$ad_f g = [f, g] = \nabla g\, f - \nabla f\, g$$

The number $r$ indicates the relative order (degree) of the system, i.e. the number of differentiations of the output required for the input to appear, which is found to be $r = 2$ for the robot arm dynamic. The Jacobian matrix of $f(x)$, denoted by $\nabla f$, consists of the elements $(\nabla f)_{ij} = \partial f_i / \partial x_j$. In order to show that the robot arm system can be fully linearized, both enumerated conditions must be met. By evaluating the vector fields, the matrix $[g, ad_f g]$ in the first condition is obtained, and its rank is found to be 2. The distribution $\Delta = \mathrm{span}\{g\}$ is also found to be involutive. Therefore, a state transformation $z = z(x)$, together with an input transformation from $u$ to the new input $v$, linearizes the dynamic in Eqn. (19) in the sense of input-state linearization. In this regard, $z(x)$ must be determined such that the conditions in Eqn. (25) hold. Equation (25) immediately yields Eqn. (26). As a result, $z_1$ must only be a function of $x_1$, as in Eqn. (27). The other states can be successively obtained from $z_1$ by Eqn. (28). Hence, the input linearizing the dynamic is found as seen in Eqn. (29), together with the relevant coefficients in Eqns. (30) and (31). In conclusion, the input-state transformation in Eqns. (26) to (31) converts the stability analysis of the nonlinear dynamic (15) using the main control input $u$ into the stability analysis of the new dynamic (32) via the new input $v$. Since the new dynamic is linear and controllable, a pole placement technique may be used to generate $v$.
It is seen that the linear state feedback control is capable of arbitrarily placing poles by choosing proper feedback gains.
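A minimal sketch of the pole placement step is given below. It assumes that the linearized second-order dynamic takes the double-integrator (canonical) form $\dot{z}_1 = z_2$, $\dot{z}_2 = v$ with $v = -Kz$, which is the usual outcome of input-state linearization for a second-order system; the function name and the pole locations $-2$ and $-3$ are hypothetical and are not taken from the paper.

```python
import numpy as np

def place_double_integrator(p1, p2):
    """Return A, B and the gain K = [k1, k2] placing the closed-loop poles at real p1, p2.

    The closed-loop characteristic polynomial of A - B K is s^2 + k2*s + k1,
    so matching (s - p1)(s - p2) gives k1 = p1*p2 and k2 = -(p1 + p2).
    """
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    K = np.array([[p1 * p2, -(p1 + p2)]])
    return A, B, K

# Hypothetical desired pole locations.
A, B, K = place_double_integrator(-2.0, -3.0)
closed_loop_poles = np.sort(np.linalg.eigvals(A - B @ K).real)
```

The eigenvalues of $A - BK$ are computed only as a check that the chosen gains reproduce the requested poles.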

CONVERGENCE ANALYSIS OF THE PROPOSED FRACTIONAL ORDER CONTROLLER
In this section, the convergence analysis of the proposed $PD^{\alpha}$-type ILC is presented. For this system, the input is $u_k(t)$ and the output is $y_k(t)$. It is assumed that the linearized plant $G_C(s)$ is BIBO stable. It is also assumed that a unique input $u_d(t)$ can be found such that the desired output trajectory $y_d(t)$ on $t \in [0, T]$ can be produced. Using (12) to (14), a $PD^{\alpha}$-type ILC update law is obtained, the error is propagated from one iteration to the next, and the resulting relation is applied recursively to reach the convergence condition (42).
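The recursion described above can be sketched in the frequency domain as follows. This is a reconstruction under stated assumptions and need not match the exact form of Eqns. (34) to (42): the symbol $\rho(s)$ and the transforms $U_d(s)$ and $Y_d(s)$ are introduced here for illustration, zero initial conditions are assumed at every iteration, and $E_0(s)$ is assumed to have finite energy.

```latex
\begin{aligned}
Y_k(s) &= G_C(s)\,U_k(s), \qquad Y_d(s) = G_C(s)\,U_d(s), \qquad E_k(s) = Y_d(s) - Y_k(s),\\
U_{k+1}(s) &= U_k(s) + \left(k_P + k_D s^{\alpha}\right) E_k(s),\\
E_{k+1}(s) &= Y_d(s) - G_C(s)\,U_{k+1}(s)
            = \left[1 - \left(k_P + k_D s^{\alpha}\right) G_C(s)\right] E_k(s)
            = \rho(s)\,E_k(s),\\
E_k(s) &= \rho(s)^{k}\,E_0(s),\\
\sup_{\omega}\left|\rho(j\omega)\right| &< 1 \;\Longrightarrow\; \lim_{k\to\infty}\left\|E_k\right\|_2 = 0 .
\end{aligned}
```

Setting $k_P = 0$ gives the corresponding factor for the $D^{\alpha}$-type law, so the same sufficient condition can be checked for both update laws by evaluating $\rho(j\omega)$ over the frequency range of interest.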

OPTIMIZATION OF FOILC PERFORMANCE USING THE BBO ALGORITHM
It is observed in the previous section that if $\alpha$, $k_P$ and $k_D$ are properly chosen such that (42) is satisfied, then the proposed fractional order ILC is convergent. In this section, a biogeography-based optimization algorithm [19] is used to tune those parameters by defining a criterion. The BBO algorithm is an evolutionary population-based technique inspired by the migration of animals and birds between islands. This method shares some properties with other biology-based algorithms such as genetic algorithms and particle swarm optimization.
The suitability of an island is measured by a habitat suitability index (HSI), which indicates how well the island supports the life of biological species. Factors such as rainfall, vegetative diversity, land area, temperature and ground determine the HSI. These factors are called suitability index variables (SIVs); they are the independent variables of a habitat, while the HSI is the dependent variable.
Many species live in habitats with a high HSI; thus the emigration rate of species to adjacent habitats is high whilst the immigration rate is low. Habitats with a low HSI have few species, which implies a high immigration rate and a low emigration rate. An example of species abundance in a habitat is depicted in Fig. 3. The emigration rate ($\mu$) and the immigration rate ($\lambda$) are functions of the number of species in the habitat. The maximum emigration rate $E$ occurs when the habitat hosts the maximum number of species that it can support, whereas the maximum immigration rate $I$ occurs when there are zero species in the habitat. The equilibrium number of species, denoted by $S_0$, is the point where the emigration and immigration rates are equal. The immigration and emigration rates can be respectively evaluated by $\lambda_S$ and $\mu_S$ as follows:

$$\lambda_S = I\left(1 - \frac{S}{S_{max}}\right), \qquad \mu_S = \frac{E\,S}{S_{max}}$$

In the specific case $E = I$, the immigration and emigration rates are as follows:

$$\lambda_S = E\left(1 - \frac{S}{S_{max}}\right), \qquad \mu_S = \frac{E\,S}{S_{max}}, \qquad \lambda_S + \mu_S = E$$

where $E$ is the maximum allowable rate and $S$ is the number of species in the habitat. The maximum number of species in a habitat is denoted by $S_{max}$, at which the immigration rate is zero and the maximum allowable emigration rate $E$ occurs. The BBO concept is based on migration and mutation, which are dealt with in the following.
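The linear migration model described above can be written as a short sketch. The function name `migration_rates` and the default values $E = I = 1$ (the values used later in the simulation settings) are chosen here for illustration.

```python
def migration_rates(S, S_max, E=1.0, I=1.0):
    """Linear BBO migration model for a habitat hosting S species out of S_max.

    Returns (immigration rate lambda_S, emigration rate mu_S).
    """
    lam = I * (1.0 - S / S_max)   # immigration falls as the habitat fills up
    mu = E * S / S_max            # emigration grows with the number of species
    return lam, mu

# An empty habitat, a half-full habitat and a saturated habitat.
rates_empty = migration_rates(0, 10)
rates_half = migration_rates(5, 10)
rates_full = migration_rates(10, 10)
```

When $E = I$, the two rates are equal at $S = S_{max}/2$, which is the equilibrium point $S_0$ of this linear model.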

Migration Strategy
The migration action in the biogeography algorithm is similar to the recombination operator in genetic and evolutionary algorithms. It is used to modify non-elite solutions. Migration can be described as a mapping of $H_j(SIV)$ to $H_i(SIV)$, i.e. $H_j(SIV) \rightarrow H_i(SIV)$ [30]. It is assumed that there are $N$ habitats, where $H_i$ is the immigrating habitat with immigration rate $\lambda_i$ and $H_j$ is the emigrating habitat with emigration rate $\mu_j$. An extension of the standard BBO migration operator is the blended migration, which is as follows [30]:

$$H_i(SIV) \leftarrow \alpha\, H_i(SIV) + (1 - \alpha)\, H_j(SIV) \tag{47}$$

In Eqn. (47), $\alpha$ is a real number between 0 and 1 that can be chosen either at random or by the user.
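One way to sketch a blended migration step is shown below. It assumes that each SIV of habitat $i$ immigrates with probability $\lambda_i$ and that the emigrating habitat $j$ is drawn by a roulette wheel proportional to $\mu_j$; this selection scheme, the function name and the two-habitat example are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def blended_migration(population, lam, mu, alpha, rng):
    """Apply one blended-migration sweep to a (habitats x SIVs) float array."""
    new_pop = population.copy()
    n_habitats, n_siv = population.shape
    for i in range(n_habitats):
        for s in range(n_siv):
            if rng.random() < lam[i]:            # habitat i accepts an immigrant SIV
                p = mu.copy()
                p[i] = 0.0                       # a habitat does not emigrate to itself
                j = rng.choice(n_habitats, p=p / p.sum())
                # Blend the resident SIV with the immigrant SIV.
                new_pop[i, s] = alpha * population[i, s] + (1.0 - alpha) * population[j, s]
    return new_pop

# Hypothetical two-habitat example: habitat 0 always immigrates from habitat 1.
rng = np.random.default_rng(0)
population = np.array([[0.0, 0.0], [1.0, 1.0]])
lam = np.array([1.0, 0.0])
mu = np.array([0.0, 1.0])
migrated = blended_migration(population, lam, mu, alpha=0.5, rng=rng)
```

With $\alpha = 0$ the blend reduces to the standard migration, in which the immigrant SIV simply replaces the resident one.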

The Mutation Operator
Sudden events lead to a deviation of the number of species from its equilibrium value and to sudden changes in the habitat HSI. In BBO, this is modeled by SIV mutation. The BBO algorithm may fail to reach an optimal point or may move away from it. Therefore, after migration, the mutation operator is applied to the solutions obtained so far to prevent stagnation (getting stuck at a point). The main aim of mutation is to create diversity in the solution set, i.e. to increase habitat diversity among the population [19,31]. The probability $P_S$ denotes the probability that the habitat contains exactly $S$ species. $P_S$ is updated from $t$ to $(t + \Delta t)$ using $\lambda$ and $\mu$ according to:

$$P_S(t + \Delta t) = P_S(t)\left(1 - \lambda_S \Delta t - \mu_S \Delta t\right) + P_{S-1}(t)\,\lambda_{S-1}\,\Delta t + P_{S+1}(t)\,\mu_{S+1}\,\Delta t \tag{48}$$

where Eqn. (48) is used to express the mutation rate. When a habitat with $S$ species is selected for mutation, the chosen variable (SIV) is changed based on the existence probability $P_S$. The mutation rate $m(S)$ is calculated in Eqn. (49) from the species-count probability $P_S$, where $m_{max}$ is a maximum mutation rate determined by the designer and $P_{max}$ is the maximum species-count probability. The number $P_S$ thus expresses the probability that $S$ species exist in the habitat.
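A small sketch of a probability-based mutation rate follows. It assumes the commonly used form $m(S) = m_{max}\,(1 - P_S / P_{max})$, which may differ in detail from Eqn. (49); the function name and the example probabilities are hypothetical.

```python
import numpy as np

def mutation_rates(P, m_max=0.05):
    """Assumed mutation-rate form: likely species counts mutate less, unlikely ones more."""
    P = np.asarray(P, dtype=float)
    return m_max * (1.0 - P / P.max())

# Hypothetical species-count probabilities for three habitats.
rates = mutation_rates([0.1, 0.2, 0.4])
```

Under this assumed form, the habitat with the most probable species count is not mutated at all, while the rate approaches $m_{max}$ as $P_S$ tends to zero.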

The Procedure and the Cost Function
To achieve an optimal set of FOILC coefficients, the BBO optimization algorithm described above is used. Each set of SIVs (suitability index variables) of the $PD^{\alpha}$-type ILC controller consists of three parameters: $\alpha$, $k_P$ and $k_D$. Similarly, each set of SIVs of the $D^{\alpha}$-type ILC controller consists of two parameters: $\alpha$ and $k_D$. To determine the HSI, i.e. the cost function, for each set of SIVs, the following steps are performed:
-First, the iterative learning control algorithm begins with an initial value of the SIVs.
-At the end of the iterations of the ILC algorithm, the error is evaluated in terms of the integral of time multiplied by the squared error (ITSE). This term is the cost function, which is defined as follows:

$$ITSE = \int_0^{t_{sim}} t\, e^2(t)\, dt \tag{50}$$

-The procedure continues until the stop criterion is met. The criterion is given either in terms of the number of iterations or in terms of the relative discrepancy of successive cost function values.
In Eqn. (50), $t_{sim}$ is the previously described duration of the simulation. In brief, the extended BBO algorithm is depicted in Fig. 4.
-Generate the SIVs of each habitat at random to determine the values of $\alpha$, $k_P$ and $k_D$. Then build the habitat matrix using equation (50). Evaluate, sort, and set the iteration counter $g = 0$.
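The ITSE cost used as the HSI can be evaluated from sampled data with a short sketch. The trapezoidal rule, the function name `itse` and the constant unit error used as a check are assumptions made for illustration.

```python
import numpy as np

def itse(t, e):
    """Integral of time multiplied by squared error, via the trapezoidal rule."""
    t = np.asarray(t, dtype=float)
    e = np.asarray(e, dtype=float)
    g = t * e**2                                   # integrand t * e(t)^2
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)))

# Hypothetical check: a constant unit error over one second gives the integral of t, i.e. 0.5.
t = np.linspace(0.0, 1.0, 101)
cost = itse(t, np.ones_like(t))
```

Because the integrand is weighted by $t$, errors that persist late in the trial are penalized more heavily than early transients.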

SIMULATION
In this section, a brief description of the desired trajectory, together with the feedback linearization technique, is given. Thereafter, the proposed fractional order iterative learning control updating laws of $D^{\alpha}$-type ILC and $PD^{\alpha}$-type ILC are used to verify the capability of the proposed controller. The fractional controller is then used to manipulate the single-link robot arm. Finally, the BBO algorithm is used to optimize the coefficients of the fractional order ILC controller.
The desired output trajectory of the robot arm, taken from [8], is depicted in Fig. 5. The proposed procedure is performed according to the following steps:
-First, the nonlinear dynamic of the robot hand in Eqn. (15) is linearized as schematically shown in Fig. 2.
-The closed-loop poles are placed at the desired locations using the state feedback scheme through the feedback gain vector $K = [k_1 \; k_2]$. This stabilizes the complete system.
-The proposed ILC, as shown in Fig. 1, is used to generate the control effort via the updating law in Eqn. (13) or Eqn. (14), where applicable.
-Ultimately, the BBO algorithm is used to tune the parameters of the fractional ILC.

Fractional-Type ILC
The initial condition of the ILC at any iteration is set to zero. Moreover, the angular velocity is assumed accessible. For $\alpha = 1$ in (13), the best choice for the learning gain is $J$, which is determined by (16) [32]. The Ninteger toolbox is used to implement the fractional order controllers in MATLAB®. The fractional order derivative $s^{\alpha}$, $\alpha \in \mathbb{R}$, is calculated using the 'nid' function with the "CRONE" approximation.
The gain k in the 'nid' function is set to the learning gain ($\gamma$ or $k_D$), which is equal to k = J. The bandwidth is [0.01, 100] rad/s, the number of zeros and poles is "n = 5", the applied method is "expansion = cf" and the approximation style is set to "decomposition = all". With these settings, the error $e_m$ ($e_m = \sup_{t \in [0,1]} |e(t)|^2$) is obtained as depicted in Fig. 6 for 40 iterations. The $D^{\alpha}$-type ILC updating law is used to generate the process input at the $(k+1)$-th iteration. In Fig. 6, the $D^{\alpha}$-type ILC updating law is used to investigate the performance of the integer and fractional iterative learning control. The simulation results show the square of the tracking error versus the iteration number for different values of $\alpha$. When $\alpha = 0$, only the state variables are used to update the ILC. In this case, the update law is a P-type ILC. In contrast, for $\alpha = 1$, the derivatives of the state variables are used, which involves the angular acceleration. This means that the updating law is a D-type ILC. Figure 7 shows the required control signal after the ILC has run for 40 iterations. These control efforts produce the output of the $D^{\alpha}$-type ILC seen in Fig. 8 for different values of $\alpha$.

The error, in terms of the maximum absolute tracking error, is depicted in Fig. 11 for different values of $\alpha$. The time response of the actual output $y$ relative to the desired output $y_d$ using the $PD^{\alpha}$-type ILC updating law is depicted in Fig. 11 during 40 iterations. It can be seen that for some values of $\alpha$, the $PD^{\alpha}$-type ILC updating law provides convergence. The control input signal obtained using the $PD^{\alpha}$-type ILC updating law for different values of $\alpha$ is shown in Fig. 12. From Fig. 13, it can be seen that with the $PD^{\alpha}$-type ILC updating law (for $\alpha = 1.75$ and with the stop criterion $e_m < 0.01$), convergence is achieved with fewer iterations, e.g. 14 iterations.
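To give a feel for a band-limited rational approximation of $s^{\alpha}$, the following sketch implements the Oustaloup recursive filter, a widely used CRONE-type approximation. It is not the code of the 'nid' function, and the mapping of "n = 5" to $N = 2$ (that is, $2N + 1 = 5$ pole-zero pairs) is an assumption; only the band [0.01, 100] rad/s is taken from the settings above.

```python
import numpy as np

def oustaloup(alpha, w_low=0.01, w_high=100.0, N=2):
    """Corner frequencies and gain of the Oustaloup approximation of s^alpha.

    Returns (zero frequencies, pole frequencies, gain) for 2N+1 pole-zero pairs,
    so that G(s) = gain * prod(s + zeros) / prod(s + poles).
    """
    k = np.arange(-N, N + 1)
    ratio = w_high / w_low
    zeros = w_low * ratio ** ((k + N + 0.5 * (1.0 - alpha)) / (2 * N + 1))
    poles = w_low * ratio ** ((k + N + 0.5 * (1.0 + alpha)) / (2 * N + 1))
    gain = w_high ** alpha
    return zeros, poles, gain

def freq_response(zeros, poles, gain, w):
    """Evaluate the rational approximation at s = j*w."""
    s = 1j * w
    return gain * np.prod(s + zeros) / np.prod(s + poles)

# Half-order differentiator evaluated at the center of the band (1 rad/s).
zeros, poles, gain = oustaloup(0.5)
G_center = freq_response(zeros, poles, gain, 1.0)
magnitude = abs(G_center)
phase_deg = np.angle(G_center, deg=True)
```

An ideal half-order differentiator has unit magnitude and a phase of 45 degrees at 1 rad/s, so comparing `magnitude` and `phase_deg` against these values indicates how well five pole-zero pairs cover the four-decade band.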

The BBO Tuned Fractional Types ILC
Here, the FOILC coefficients are tuned by the previously described BBO algorithm. The parameters of the BBO are set as follows: the maximum number of iterations is 10, the population size is 50, the number of elite islands is 5, the species dimension (number of SIVs) is 3, the maximum mutation rate is $m_{max} = 0.05$, $\alpha = 0.9$ in (44), and the maximum emigration and immigration rates are set to $E = 1$ and $I = 1$, respectively. Figure 14 shows a uniform convergence of the BBO algorithm as the fitness gradually decreases. This BBO setting yields the FOILC coefficients in Table 1. The time behavior of the error of both $PD^{\alpha}$-type and $D^{\alpha}$-type ILC is illustrated in Fig. 15. The control effort required by both types of fractional order learning law is depicted in Fig. 16. The results after 10 iterations of the BBO algorithm for BBO-$D^{\alpha}$-type ILC and BBO-$PD^{\alpha}$-type ILC are illustrated in Fig. 17. In comparison with Fig. 15, it can be seen that the convergence speed increases when the FOILC coefficients are tuned by the BBO algorithm. Meanwhile, Fig. 9 confirms that the fastest convergence of the $D^{\alpha}$-type ILC takes 16 iterations, whereas the BBO-tuned $D^{\alpha}$-type ILC improves the convergence rate by a factor of about 3, i.e. to 5 iterations, as seen in Fig. 14. A similar result is achieved for the $PD^{\alpha}$-type ILC, whose fastest convergence takes 14 iterations (Fig. 11), whereas the BBO-designed $PD^{\alpha}$-type ILC converges in 4 iterations (Fig. 14). This confirms a substantial improvement of the convergence speed when the BBO is used to optimally tune the parameters of the ILC.

CONCLUSION
In this paper, by combining fractional order calculus and an ILC controller, the performance of two types of updating laws, called $PD^{\alpha}$-type ILC and $D^{\alpha}$-type ILC, is investigated. Necessary conditions of convergence for the proposed $PD^{\alpha}$-type ILC for the nonlinear robot arm are presented for the first time. A feedback linearization approach is used together with a state feedback to arbitrarily assign the poles of the linearized system.
A simulation is carried out on a feedback-linearized robot arm. A BBO optimization algorithm is proposed to adjust the coefficients of the FOILC. The simulation results illustrate that optimal tuning of the FOILC coefficients increases the convergence speed for both of the proposed $PD^{\alpha}$-type and $D^{\alpha}$-type ILC learning laws.