We began with a new computational method for computing Lyapunov exponents of a dynamical system, either from data, or from a known map. Our motivation was the unsatisfactory elements of conventional procedures of exponent estimation using time averages of long orbits of randomly chosen initial points. These included: uncertainty over the orbit length required for convergence of the exponent calculation; the perhaps infeasible computing time for lengthy orbits; numerical instability of the multiplication of hundreds of thousands of matrices; and uncertainty over whether the ``random'' orbit chosen is a typical orbit of the system. Each of these was a very real problem. It seemed that the natural way to overcome these difficulties was to compute the exponents by integrating local divergence over phase space using an estimate of the physical measure of the system, thus eliminating the need for long orbits and the difficulties associated with them. As the Lyapunov exponents of a system must be defined with respect to an invariant measure of the system, to explicitly use an estimate of the physical invariant measure in the computation again seemed the natural thing to do. Our preliminary attempts at exponent calculation using a spatial average were promising, and in fact, this early algorithm not only showed that ``it could be done'', but could also be used as a simple check of time average computations. This early algorithm seemed a little hodge-podge though, and so I moved on to a more integrated approach to the spatial averaging procedure. As we were already using what could be considered a finite state Markov chain to approximate the physical invariant measure, I looked for a way to define the Lyapunov exponents of this Markov chain so that they may be close to those of the deterministic system. We identified whole chunks of phase space in $M$ with a single state in our finite state Markov chain. To each chunk of space we assigned a single matrix that was close to the Jacobian matrices of $T$ in that region of $M$. We defined our transition probabilities in such a way that the random system defined by the Markov chain was a small random perturbation of $T$. As we reduced the diameter of our chunks of space, we simultaneously refined our estimates of the Jacobian matrices, and reduced the magnitude of the random perturbations applied to $T$, thus making the evolution of our Markov chain increasingly similar to that of the deterministic system. It now seemed reasonable to suppose that the exponents of our random system would be close to those of our deterministic system. Unfortunately, there are technicalities which prevent this from begin true in general, even though numerical experiments supported such a claim. The technicalities sprung from the fact that the Lyapunov exponents of $C^1$-close maps are not necessarily close. Such problems also arise for the time averaging algorithm, but are mostly swept under the carpet. According to \cite{lsyprivatecomm}, if each of the directions $w^{(i)}_x$ vary continuously with $x\in M$ for $i=1,\ldots,d$ (a strong hyperbolicity condition), then the exponents of nearby random systems are close to those of the original map.
The only information that is required for our technique is a time series of experimental data; our piecewise linear reconstruction provides us with estimates of the Jacobian matrices, and a partition on which to define the action of our Markov chain. Having an explicit representation of the map at hand is a bonus that extra use may be made of. From a computational point of view, we need only (i) construct the transition matrix $P$ (representing evolution of the system) and find its unique fixed vector, (ii) construct the transition matrices $t^{(i)}$, $i=1,\ldots,n$, (representing the action of the Jacobian matrices on projective space), and (iii) form the matrix $\fM$ from $P$ and the $t^{(i)}$ and compute its fixed vector. These objects are then simply combined using (?) to evaluate the exponents.
We had expected the probability measures $\xi_i$ that describe the directions in which to measure the average stretching of the random system at state $i$ to be related to the special directions $w_x$ for the deterministic systems. If our perturbed system was accurately describing the dynamics of $T$, we expected the probability measure $\xi_i$ to heavily weight those directions lying near $w_x$, where $x$ is lying in state $i$. To my delight, Figure \ref{lyapspikes29} confirmed this. Our spatial average gives four different techniques to calculate the exponents of the Markov chains, based on combinations of the forward and reverse chains, and using the Jacobian matrices and their inverses. These alternative procedures may be used in situations where some (negative, for example) exponents are difficult to resolve computationally, or simply to double-check a calculation. The sum of the Lyapunov exponents of the induced Markov chain is extremely simple to calculate from (\ref{compsumexpts}). In addition to a method of calculating Lyapunov exponents of deterministic systems, we have also developed a very accurate method of estimating Lyapunov exponents of Markov chains. If one were not performing a space average, the exponents of the Markov chain would have to be computed by running out long computer-generated ``random'' orbits, and using the conventional time average techniques with all of their associated problems. The ability of computers to consistently generate truly random orbits is now an added complication not present in the deterministic case. Our space average avoids all of this.
Finally, we mention the only other multidimensional spatial averaging method we are aware of; that of Hsu and Kim \cite{kim}. Their algorithm consisted of a crude construction of a transition matrix on a fine grid in phase space to model the dynamics of the map and provide an estimate of an invariant measure. When it came to deciding which directions to measure the local stretching in, Hsu and Kim chose a single direction, based on a heuristic formula. Their method provided no theoretical basis for the computations and no results on convergence of the computed exponents to the true ones. Further, the map must be known in order to construct their transition matrix. Their method is much more computationally intensive when it comes to the Markov chain description of the map. Their grids have on the order of 800 000 elements, with $2\times 10^7$ data points needed to construct the $810\, 000 \times 810\, 000$ transition matrix for their H\'enon map example. Just to compute the invariant density of this matrix is a serious undertaking. We can achieve results of similar accuracy with transition matrices of around 100--1000 states, being constructed from 20--200 data points. This improvement is most likely due to the fact that for each state, we have developed a method of measuring the stretching in a ``smear'' of directions as should be done, rather than in just a single direction. \vspace{.5cm}
At the conclusion of Chapter 1, the issue of whether or not we were approximating the ``physical'' invariant measure of our dynamical system had not been resolved. As many of the dynamical systems for which we wanted to find physical invariant measures did not have a well-defined notion of a physical measure, we needed a process to make a suitable selection from the infinity of $T$-invariant probability measures. Our idea was to subject the system to small amounts of noise, and then compute the (unique) invariant measure of this noisy system. As the noise level went to zero, we extracted a limiting measure which was shown to be $T$-invariant. As this limiting measure arose as a limit of noisy systems, we argued that it was of physical significance as it was robust with respect to small perturbations in some sense. In this way, we have selected a single $T$-invariant measure from the infinity available. We began Chapter 2 by extending the rather simplistic definition of the finite state Markov chain in Chapter 1 to a Markov chain on the whole of the phase space $M$ with transition function $\bP$. Although our Markov chain now operated on the whole of $M$, the transition function was still finitely describable and had an absolutely continuous invariant measure that was simply calculated as the left fixed vector of the transition matrix $P$ associated with $\bP$. It was shown that this Markov chain was indeed a small random perturbation of $T$, and that we could find invariant measures of $T$ by extracting limit points from a sequence of invariant measures of the perturbed systems with the noise level going to zero. At each finite stage, our approximate measure was absolutely continuous with respect to Lebesgue. If this proved to be a difficulty for integrating continuous test functions with respect to the approximate measure, we noted that a singular representation could also be constructed, so that the two representations shared a common zero-noise limit point. The integration of a test function now just boiled down to evaluating the function at a finite number of points. We identified that the way to decrease the perturbations of our random system was to refine the partition sets. Such a refinement must reduce the diameter of the sets, not just the volume, and we gave suitable refinement procedures in two and three dimensions that reduced diameters while not producing too many extra partition sets. Three examples of constructing estimates of the physical measure from data were given for the H\'enon map (from both a time series, and a set of fixed data points and their images), and a nonlinear torus map (from time series). The numerical results of these experiments were very encouraging. The accuracy of the invariant measures were tested by running out long orbits of the reconstructed systems and forming histograms on the same partition sets as those used to construct the approximations. Our approximate invariant measures gave similar weights to the partition sets as the histograms formed from long orbits; thus our constructed invariant measures appeared to be close to the ``physical'' ones (those exhibited by most long orbits). As a further way of increasing the accuracy of our approximate measures, we considered ``pushing forward'' our measures via the action of the reconstructed map $T$. In cases where the notion of a physical measure has a precise mathematical meaning it may be shown that this pushing forward does improve accuracy. Our estimate of the physical measure of the nonlinear toral map in Figure \ref{shrunkgreyscaleimtorus200} is an excellent result, considering we only had a time series of 11 data points to work with.
Since we had been using added noise as a selection process of the physical measure, the question arose as to whether any old small random perturbation would select a good candidate for a physical measure in the zero noise limit. We saw that this was most certainly not the case via a simple example. We then considered a broad class of perturbations: ones for which (i) the true image is contained in the collection of possible random images and (ii) mass in phase space is spread out, that is, the Lebesgue measure of the random image of a set is always larger than the Lebesgue measure of the set itself. One may have expected that this class of perturbations would always produce a physical measure, particularly in those cases where the physical measure is absolutely continuous with respect to Lebesgue. However, this was not the case. It was shown via two counterexamples that perturbations in this class may produce non-physical measures in the zero-noise limit. The first counterexample was a numerical experiment designed to be a ``believable'' example of something that may actually go wrong. The second counterexample was a contrived situation in which we were able to prove that a non-physical measure lay at the zero-noise limit. Both of these ``bad'' perturbations were obtained from our perturbation by slightly varying the non-zero entries in the transition matrices. We also noted that this varying of the entries in the transition matrices provides a mechanism for Hsu's method of estimating the physical invariant measure to fail as well. In light of our good numerical results, it therefore seems that our perturbations have some extra information that we are not using. As yet, we have no counterexample for our class of perturbations. That is, we have yet to find a map with a well-defined physical measure, and a reasonable sequence of partitions, such that our perturbations defined on this sequence of partitions produce a non-physical measure in the limit. \vspace{.5cm}
In Chapter 3 we sought a proof of the fact that our perturbed systems produced the physical measure in the zero-noise limit. We restricted ourselves to $C^2$ expanding, and transitive Anosov maps, as these systems have well-defined physical measures. We were particularly interested in the Anosov case, as I was unaware of any rigorous extension of Ulam's approximation to multidimensional Anosov systems. A Markov partition was chosen as the partition from which to construct our transition matrices. With this partition, the entries of the transition matrices took on special meanings, giving information on the rate of local stretching of $T$ in unstable directions. The Markov partitions also gave us greater control over the dynamics of $T$, allowing us to code the dynamics as a subshift of finite type. It was known that the physical (SBR) measure arose as an equilibrium state of $T$ for a special weight function, namely $-log|\mbox{local stretching in unstable direction}|$. By using the relative volumes of intersection between the partition sets and their inverse images, we were able to construct a piecewise constant approximation to this special weight function. We then lifted this piecewise constant weight function to our symbol space; it would now be a simple matter to compute the equilibrium state of this function as the fixed left eigenvector of a matrix. It then turned out, in fact, that the matrix that defines the Markov measure that is our approximate equilibrium state is none other then our original transition matrix. To finish off, we showed that as our Markov partition was refined, the piecewise constant approximation to the special weight function became more accurate, and that the equilibrium states for those approximate weight functions converge to the equilibrium state for the special weight function.
We noted that Lebesgue measure need not be used to construct our transition matrix; rather the same result could be achieved using any measure that is equivalent to Lebesgue to construct our matrix. The result could also be extended to Axiom A system by a suitable definition of a ``Markovian'' partition for $M$ (not the attractor of zero Lebesgue measure). We have taken a completely different line of proof to previous proofs for Ulam-type convergence results for expanding maps \cite{li,boyarskyjablonski,dingzhou,dingzhoumarkov}. These proofs have exclusively relied on bounded variation arguments appealing to the Lasota/Yorke inequality. This inequality says that the Perron-Frobenius operator ``smooths out'' the variation of the density functions to which the operator is applied. From this, a universal bound is found for the variation of the sequence of approximate invariant densities, thus showing that this set of densities is relatively compact, and allowing the extraction of a limiting density. As we were after an extension of these convergence results to Anosov maps, a completely new proof technique was necessary, as the Perron-Frobenius operators for Anosov diffeomorphisms in no way smooth out density functions. Our result (\ref{mainresult}) represents the first step that I am aware of toward rigorous computation approximation of SBR measures of Anosov systems. \vspace{.5cm}
In Chapter 4 we sought an extension of the result in Chapter 3 to the situation where Markov partitions are not used. Although the result of Chapter 3 in principle provides us with a rigorous way to approximate the physical invariant measure on a computer, the fact that Markov partitions must be constructed, means that the method is time consuming to implement. Of course once we do away with our Markov partitions, we no longer have the interpretation of our transition matrix elements as estimates of the local stretching in unstable directions. In addition, we lose the ability to code the system as a subshift of finite type and all of the symbolic dynamics formalism that goes with it. We had to start from scratch; our work in Chapter 3 would not be of any help. We again considered $C^{1+\Lip}$ expanding and transitive Anosov maps, seeking to approximate the unique absolutely continuous measure. Our argument was based on a few simple observations; the first of which was that it we used the absolutely continuous measure $\mu$ rather than Lebesgue measure $m$ to construct our transition matrix, the probability measure that we compute as a fixed left eigenvector would give exactly the correct weight to each partition set. This did not seem to be of much use initially, as we don't know beforehand what $\mu$ is. However, we then observed that since $\mu$ was equivalent to $m$, the transition matrices $P$ (constructed from $\mu$) and $\tilde{P}$ (constructed from $m$) were not very different entrywise. In fact, we were able to show that as the diameters of the partition sets decreased, $P$ and $\tilde{P}$ approached each other in norm at a rate of $1/n^{1/d}$ where $n$ is the number of partition sets. Since $P$ and $\tilde{P}$ were close, so too must be their eigenvectors $p$ and $\tilde{p}$. And as $p$ gives exactly the right weight to each partition set, $\tilde{p}$ must give something close to the right weight to each partition set. Further, we showed that if the vectors $p_n$ and $\tilde{p}_n$ converge in norm as $n\to\infty$ (diameter of partition sets goes to zero), then our approximate invariant measure (constructed from $\tilde{p}$) converges to the physical measure of $T$. To bound the distance between $\tilde{p}$ and $p$ we used the inequality (\cite{schweitzer}) involving the fundamental matrix $Z$ for the transition matrix $P$. For us to achieve convergence of $\tilde{p}_n$ to $p_n$ as $n\to\infty$, we needed to show that $Z(P_n)$ grew at most logarithmically with $n$. We noted that the norm of $Z$ was tied to the rate at which initial densities approach equilibrium under iteration of $P$. We expressed the growth rate of the norm of $Z_n$ in terms of both the growth rate of the time to get ``half-way'' to equilibrium and the growth rate of the constant $R_n$ in equation (\ref{geonormcvgceeqn}). It was difficult to say anything useful about the growth rate of these quantities for general maps, so we turned to some cases where we have more control on the mixing properties of the map. We first dealt with the case where $T$ was a piecewise linear Markov map, and used a Markov partition to construction the transition matrix. Here we were able to show that $R_n$ grew at a polynomial rate, and that $r_n$ could be bounded away from unity by a universal constant, implying that in this situation our approximate invariant measures converge to the physical invariant measure. We then turned to piecewise $C^{1+\Lip}$ expanding maps of the interval. We saw that if $T$ stretched phase space strongly and uniformly enough, we could make the Perron-Frobenius operator for $T$ (with respect to $\mu$) a contraction in the Banach space of functions of bounded variation. From this it followed that we could bound $r_0$ away from unity and that $R_n$ grew linearly. This argument could be extended to expanding maps that did not stretch sufficiently strongly or uniformly by constructing $k$-step transition matrices from the map $T^k$ rather than $T$. In this way, we obtain an alternative, simpler proof of the main result of \cite{huntmiller}. Finally, we performed some numerical experiments on a linear toral automorphism, finding that $\|Z_n\|$, $R_n$ and $m_n$ grew logarithmically, polynomially, and logarithmically respectively. Such behaviour implied that our approximate invariant measures converge to the physical invariant measure as the diameters of the partition sets go to zero. We expect to find similar behaviour for non-linear maps.
Our numerical results lent weight to my conjecture that strong expanding properties are not necessary to prove Ulam-type convergence results, but that sufficiently strong mixing properties will be enough. These sufficiently strong mixing properties were displayed by uniformly hyperbolic systems such as Axiom A or Anosov maps, where initial distributions approach the physical equilibrium distribution exponentially quickly. For our perturbation analysis to go through, we needed to show that our transition matrix $P$ is sufficiently insensitive to perturbation of its entries. This meant that we had to appropriately link the dynamics of $T$ to $P$; something that turned out to be quite difficult to do. The computational advantages of our proof are that (i) no special partitions are needed, (ii) the result holds in any dimension, and (iii) only a modest mixing assumption on the map is needed. It is my hope that this argument lays the foundation for a new way to extend Ulam-type convergence results. \vspace{.5cm}
Chapter 5 was motivated by the desire to explore further the connection between the mixing properties of our map $T$ and those of the transition functions $P$ that represent in some rough sense the action of $T$. We began by introducing two different types of mixing, namely w-mixing and o-mixing, the rates of which we expected to vary depending on the space of test functions employed. We first briefly considered trying to link the rate of mixing of our original system to a stochastically perturbed counterpart (with a view to particularising any result to class of perturbations generated by the transition matrices). However, this route was abandoned, as known work indicated that this problem was a difficult one, even for simple maps and pleasant perturbations. Our transition matrices were then considered as representation the action of the Perron-Frobenius operator projected onto a finite-dimensional space of test functions. This finite-dimensional space consisted of piecewise constant functions that were constant on each of our partition sets. We then asked the questions ``How is the spectrum of this projected Perron-Frobenius operator related to that of the real one?'' and ``How is the rate of mixing of this projected Perron-Frobenius operator related to the various types of mixing that the real one displays?''. The latter was of the most importance to us as really we were only interested in how quickly initial distributions approached the equilibrium distribution. We calculated the rates of w-mixing and o-mixing with respect to the functions spaces $L^1(M,\mu)$, $BV([0,1,],\bR)$ and $C^r(M,\bR)$ for a collection of four model maps. These maps displayed varying degrees of mixingness, being non-ergodic, ergodic, mixing (in the measure-theoretic sense), and exact. We compared these six w-mixing and o-mixing values with the value of the second largest eigenvalue of a transition matrix constructed on an ``arbitrary'' fine partition. We were looking for which combination of w-mixing or o-mixing with which set of test functions gave a correspondence between exponential w-mixing/o-mixing and a second largest eigenvalue bounded away from unity in our transition matrix. Based on these results it seemed that if $T$ was exponentially w-mixing with respect to $C^r$ test functions, then the $(T,\mu)$ transition matrices would have their second largest eigenvalue bounded away from unity for arbitrarily fine partitions. We had no proof of this result, however. Further work would involve proving that such a connection was there (or not), and perhaps also find a precise connection between the numerical values of the rate of w-mixing and the second largest eigenvalue of the transition matrices was. This latter problem will probably be quite difficult and only immediately solvable in certain cases, as finding the minimal rates of mixing for maps is still an unsolved problem for all but the simplest transformations; see Rychlik \cite{rychliksmooth} or Liverani \cite{liverani,liveraniannals}.
We had a look at the spectrum of the transition matrices for the H\'enon map, and conjectured that since the second largest eigenvalue seemed to be bounded away from unity, that perhaps the H\'enon map would display exponential w-mixing with respect to $C^r$ functions. The effect of the measures used to construct the $(T,\mu)$ transition matrices on the spectrum of the matrices was also considered. We constructed transition matrices for the map (\ref{kenjimap}) using Lebesgue measure, its unique absolutely continuous invariant measure, and another non-invariant measure equivalent to Lebesgue. Each of these transition matrices had almost identical rates of mixing, as expected from the arguments of Chapter 4. The spectrum of the transition matrix from the numerical counterexample of Chapter 2 was also investigated. This matrix was constructed from the same map (\ref{kenjimap}), however the nonzero entries of the matrix had been slightly altered. We found that the rate of mixing of this new transition matrix was significantly different to those of the three just constructed from measure equivalent to Lebesgue. This finding lent weight to our argument of Chapter 2 that we were approximating a non-physical invariant measure with the eigenvectors of our altered matrix.
Aggregation was introduced as a way of linking transition matrices on refined partitions to those constructed from their coarser counterparts. It provided us with a simple way of computing transition matrices for coarser partitions without having to explicitly construct the coarser partitions and recalculate the areas of intersection. We tried to use the process of aggregation to link the mixing properties of transition matrices constructed from finer partitions with those constructed from coarser ones. Unfortunately, the multistep evolution of a Markov chain has very little to do with that of its aggregate, and aggregation was of no direct use in this way. We noted that Ulam's approximation to the Perron-Frobenius operator is in fact a generalised aggregation of the operator. If we aggregated the Perron-Frobenius operator ``correctly'' (using the physical invariant measure $\mu$), then we would obtain our special transition matrix of Chapter 4. However, the operator is usually (\cite{li,boyarskyjablonski,dingzhou,dingzhoumarkov}) aggregated with respect to Lebesgue measure (though not recognised as such). For this reason, the resulting matrix representation does not give the correct weight to each partition set. This realisation that the Perron-Frobenius operator arises as an aggregation allowed us to consider a perhaps more informed perturbation argument than that proposed in Chapter 4. If two measures $\mu$ and $\tilde{\mu}$ were close (in a weaker relative sense on each partition set, rather than globally), then the two transition matrices that arose as aggregations of the Perron-Frobenius operator would also be close, as would be their invariant densities. Further work could involve developing a perturbation argument that took into account the special structure of these matrices, or avoided the construction of the matrices altogether and proceeded straight to the conclusion that the invariant densities were close. Finally, we presented a heuristic argument as to why our special aggregation is in some sense the best finite representation of the dynamics of $T$.