article
string
 abstract
string


"additive models @xcite provide an important family of models for semiparametric regression or classification . some reasons for the success of additive models are their increased flexibility when compared to linear or generalized linear models and their increased interpretability when compared to fully nonparametric models . it is well  known that good estimators in additive models are in general less prone to the curse of high dimensionality than good estimators in fully nonparametric models . many examples of such estimators belong to the large class of regularized kernel based methods over a reproducing kernel hilbert space @xmath0 , see e.g. @xcite . in the last years many interesting results on learning rates of regularized kernel based models for additive models have been published when the focus is on sparsity and when the classical least squares loss function is used , see e.g. @xcite , @xcite , @xcite , @xcite , @xcite , @xcite and the references therein . of course , the least squares loss function is differentiable and has many nice mathematical properties , but it is only locally lipschitz continuous and therefore regularized kernel based methods based on this loss function typically suffer on bad statistical robustness properties , even if the kernel is bounded . this is in sharp contrast to kernel methods based on a lipschitz continuous loss function and on a bounded loss function , where results on upper bounds for the maxbias bias and on a bounded influence function are known , see e.g. @xcite for the general case and @xcite for additive models . therefore , we will here consider the case of regularized kernel based methods based on a general convex and lipschitz continuous loss function , on a general kernel , and on the classical regularizing term @xmath1 for some @xmath2 which is a smoothness penalty but not a sparsity penalty , see e.g. @xcite . 
such regularized kernel based methods are now often called support vector machines ( svms ) , although the name was historically used only for such methods based on the special hinge loss function and for special kernels ; we refer to @xcite . in this paper we address the open question of whether an svm with an additive kernel can provide a substantially better learning rate in high dimensions than an svm with a general kernel , say a classical gaussian rbf kernel , if the assumption of an additive model is satisfied . our leading example covers learning rates for quantile regression based on the lipschitz continuous but non-differentiable pinball loss function , which is also called the check function in the literature , see e.g. @xcite and @xcite for parametric quantile regression and @xcite , @xcite , and @xcite for kernel based quantile regression . we will not address the question of how to check whether the assumption of an additive model is satisfied , because this would be the topic of a paper of its own . of course , a practical approach might be to fit both models and compare their risks evaluated on test data . for the same reason we will also not cover sparsity . consistency of support vector machines generated by additive kernels for additive models was considered in @xcite . in this paper we establish learning rates for these algorithms . let us recall the framework with a complete separable metric space @xmath3 as the input space and a closed subset @xmath4 of @xmath5 as the output space . a borel probability measure @xmath6 on @xmath7 is used to model the learning problem and an independent and identically distributed sample @xmath8 is drawn according to @xmath6 for learning . a loss function @xmath9 is used to measure the quality of a prediction function @xmath10 by the local error @xmath11 . 
_ throughout the paper we assume that @xmath12 is measurable , @xmath13 , convex with respect to the third variable , and uniformly lipschitz continuous satisfying @xmath14 with a finite constant @xmath15 . _ support vector machines ( svms ) considered here are kernel  based regularization schemes in a reproducing kernel hilbert space ( rkhs ) @xmath0 generated by a mercer kernel @xmath16 . with a shifted loss function @xmath17 introduced for dealing even with heavy  tailed distributions as @xmath18 , they take the form @xmath19 where for a general borel measure @xmath20 on @xmath21 , the function @xmath22 is defined by @xmath23 where @xmath24 is a regularization parameter . the idea to shift a loss function has a long history , see e.g. @xcite in the context of m  estimators . it was shown in @xcite that @xmath22 is also a minimizer of the following optimization problem involving the original loss function @xmath12 if a minimizer exists : @xmath25 the additive model we consider consists of the _ input space decomposition _ @xmath26 with each @xmath27 a complete separable metric space and a _ hypothesis space _ @xmath28 where @xmath29 is a set of functions @xmath30 each of which is also identified as a map @xmath31 from @xmath3 to @xmath5 . hence the functions from @xmath32 take the additive form @xmath33 . we mention , that there is strictly speaking a notational problem here , because in the previous formula each quantity @xmath34 is an element of the set @xmath35 which is a subset of the full input space @xmath36 , @xmath37 , whereas in the definition of sample @xmath8 each quantity @xmath38 is an element of the full input space @xmath36 , where @xmath39 . because these notations will only be used in different places and because we do not expect any misunderstandings , we think this notation is easier and more intuitive than specifying these quantities with different symbols . 
the additive kernel @xmath40 is defined in terms of mercer kernels @xmath41 on @xmath27 as @xmath42 it generates an rkhs @xmath0 which can be written in terms of the rkhs @xmath43 generated by @xmath41 on @xmath27 corresponding to the form ( [ additive ] ) as @xmath44 with norm given by @xmath45 the norm of @xmath46 satisfies @xmath47 to illustrate advantages of additive models , we provide two examples of comparing additive with product kernels . the first example deals with gaussian rbf kernels . all proofs will be given in section [ proofsection ] . [ gaussadd ] let @xmath48 , @xmath49 $ ] and @xmath50 ^ 2.$ ] let @xmath51 and @xmath52.\ ] ] the additive kernel @xmath53 is given by @xmath54 furthermore , the product kernel @xmath55 is the standard gaussian kernel given by @xmath56 define a gaussian function @xmath57 on @xmath58 ^ 2 $ ] depending only on one variable by @xmath59 then @xmath60 but @xmath61 where @xmath62 denotes the rkhs generated by the standard gaussian rbf kernel @xmath63 . the second example is about sobolev kernels . [ sobolvadd ] let @xmath64 , @xmath65 $ ] and @xmath58^s.$ ] let @xmath66 : = \bigl\{u\in l_2([0,1 ] ) ; d^\alpha u \in l_2([0,1 ] ) \mbox{~for~all~}\alpha\le 1\bigr\}\ ] ] be the sobolev space consisting of all square integrable univariate functions whose derivative is also square integrable . it is an rkhs with a mercer kernel @xmath67 defined on @xmath68 ^ 2 $ ] . if we take all the mercer kernels @xmath69 to be @xmath67 , then @xmath70 $ ] for each @xmath71 . the additive kernel @xmath72 is also a mercer kernel and defines an rkhs @xmath73\right\}.\ ] ] however , the multivariate sobolev space @xmath74^s)$ ] , consisting of all square integrable functions whose partial derivatives are all square integrable , contains discontinuous functions and is not an rkhs . denote the marginal distribution of @xmath6 on @xmath27 as @xmath75 . 
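The contrast between the additive and the product Gaussian kernel in the first example can be made concrete with a small numerical sketch. The exact scaling of the Gaussian kernels is elided in this extract, so the form exp(-(u - v)^2 / sigma^2) used below is an assumption, as are the helper names `gauss_1d`, `additive_kernel`, and `product_kernel`; none of them come from the paper.

```python
import numpy as np

def gauss_1d(u, v, sigma=1.0):
    """Univariate Gaussian RBF Gram matrix (assumed scaling: exp(-(u-v)^2 / sigma^2))."""
    d = u[:, None] - v[None, :]
    return np.exp(-(d ** 2) / sigma ** 2)

def additive_kernel(X, Z, sigma=1.0):
    """Additive kernel: sum over coordinates of the univariate kernels."""
    return sum(gauss_1d(X[:, j], Z[:, j], sigma) for j in range(X.shape[1]))

def product_kernel(X, Z, sigma=1.0):
    """Product kernel: product over coordinates, i.e. the usual multivariate Gaussian RBF."""
    out = np.ones((X.shape[0], Z.shape[0]))
    for j in range(X.shape[1]):
        out *= gauss_1d(X[:, j], Z[:, j], sigma)
    return out

# Hypothetical sample on the unit square [0, 1]^2.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(20, 2))
K_add = additive_kernel(X, X)
K_prod = product_kernel(X, X)
```

Both Gram matrices are symmetric and positive semidefinite, but they induce different function spaces: the additive kernel only generates sums of univariate functions, which is the structural restriction the examples exploit.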
under the assumption that @xmath76 for each @xmath71 and that @xmath43 is dense in @xmath29 in the @xmath77metric , it was proved in @xcite that @xmath78 in probability as long as @xmath79 satisfies @xmath80 and @xmath81 . the rest of the paper has the following structure . section [ ratessection ] contains our main results on learning rates for svms based on additive kernels . learning rates for quantile regression are treated as important special cases . section [ comparisonsection ] contains a comparison of our results with other learning rates published recently . section [ proofsection ] contains all the proofs and some results which can be interesting in their own . in this paper we provide some learning rates for the support vector machines generated by additive kernels for additive models which helps improve the quantitative understanding presented in @xcite . the rates are about asymptotic behaviors of the excess risk @xmath82 and take the form @xmath83 with @xmath84 . they will be stated under three kinds of conditions involving the hypothesis space @xmath0 , the measure @xmath6 , the loss @xmath12 , and the choice of the regularization parameter @xmath85 . the first condition is about the approximation ability of the hypothesis space @xmath0 . since the output function @xmath19 is from the hypothesis space , the learning rates of the learning algorithm depend on the approximation ability of the hypothesis space @xmath0 with respect to the optimal risk @xmath86 measured by the following approximation error . [ defapprox ] the approximation error of the triple @xmath87 is defined as @xmath88 to estimate the approximation error , we make an assumption about the minimizer of the risk @xmath89 for each @xmath90 , define the integral operator @xmath91 associated with the kernel @xmath41 by @xmath92 we mention that @xmath93 is a compact and positive operator on @xmath94 . 
hence we can find its normalized eigenpairs @xmath95 such that @xmath96 is an orthonormal basis of @xmath94 and @xmath97 as @xmath98 . fix @xmath99 . then we can define the @xmath100th power @xmath101 of @xmath93 by @xmath102 this is a positive and bounded operator and its range is well  defined . the assumption @xmath103 means @xmath104 lies in this range . [ assumption1 ] we assume @xmath105 and @xmath106 where for some @xmath107 and each @xmath108 , @xmath109 is a function of the form @xmath110 with some @xmath111 . the case @xmath112 of assumption [ assumption1 ] means each @xmath113 lies in the rkhs @xmath43 . a standard condition in the literature ( e.g. , @xcite ) for achieving decays of the form @xmath114 for the approximation error ( [ approxerrordef ] ) is @xmath115 with some @xmath116 . here the operator @xmath117 is defined by @xmath118 in general , this can not be written in an additive form . however , the hypothesis space ( [ additive ] ) takes an additive form @xmath119 . so it is natural for us to impose an additive expression @xmath120 for the target function @xmath121 with the component functions @xmath113 satisfying the power condition @xmath110 . the above natural assumption leads to a technical difficulty in estimating the approximation error : the function @xmath113 has no direct connection to the marginal distribution @xmath122 projected onto @xmath27 , hence existing methods in the literature ( e.g. , @xcite ) can not be applied directly . note that on the product space @xmath123 , there is no natural probability measure projected from @xmath6 , and the risk on @xmath124 is not defined . our idea to overcome the difficulty is to introduce an intermediate function @xmath125 . it may not minimize a risk ( which is not even defined ) . however , it approximates the component function @xmath113 well . 
when we add up such functions @xmath126 , we get a good approximation of the target function @xmath121 , and thereby a good estimate of the approximation error . this is the first novelty of the paper . [ approxerrorthm ] under assumption [ assumption1 ] , we have @xmath127 where @xmath128 is the constant given by @xmath129 the second condition for our learning rates is about the capacity of the hypothesis space measured by @xmath130empirical covering numbers . let @xmath131 be a set of functions on @xmath21 and @xmath132 for every @xmath133 the * covering number of @xmath131 * with respect to the empirical metric @xmath134 , given by @xmath135 is defined as @xmath136 and the * @xmath130empirical covering number * of @xmath137 is defined as @xmath138 [ assumption2 ] we assume @xmath139 and that for some @xmath140 , @xmath141 and every @xmath142 , the @xmath130empirical covering number of the unit ball of @xmath43 satisfies @xmath143 the second novelty of this paper is to observe that the additive nature of the hypothesis space yields the following nice bound with a dimension  independent power exponent for the covering numbers of the balls of the hypothesis space @xmath0 , to be proved in section [ samplesection ] . [ capacitythm ] under assumption [ assumption2 ] , for any @xmath144 and @xmath145 , we have @xmath146 the bound for the covering numbers stated in theorem [ capacitythm ] is special : the power @xmath147 is independent of the number @xmath148 of the components in the additive model . it is well  known @xcite in the literature of function spaces that the covering numbers of balls of the sobolev space @xmath149 on the cube @xmath150^s$ ] of the euclidean space @xmath151 with regularity index @xmath152 has the following asymptotic behavior with @xmath153 : @xmath154 here the power @xmath155 depends linearly on the dimension @xmath148 . 
similar dimension  dependent bounds for the covering numbers of the rkhss associated with gaussian rbf  kernels can be found in @xcite . the special bound in theorem [ capacitythm ] demonstrates an advantage of the additive model in terms of capacity of the additive hypothesis space . the third condition for our learning rates is about the noise level in the measure @xmath6 with respect to the hypothesis space . before stating the general condition , we consider a special case for quantile regression , to illustrate our general results . let @xmath156 be a quantile parameter . the quantile regression function @xmath157 is defined by its value @xmath158 to be a @xmath159quantile of @xmath160 , i.e. , a value @xmath161 satisfying @xmath162 the regularization scheme for quantile regression considered here takes the form ( [ algor ] ) with the loss function @xmath12 given by the pinball loss as @xmath163 a noise condition on @xmath6 for quantile regression is defined in @xcite as follows . to this end , let @xmath164 be a probability measure on @xmath165 and @xmath166 . then a real number @xmath167 is called @xmath159quantile of @xmath164 , if and only if @xmath167 belongs to the set @xmath168\bigr ) \ge \tau \mbox{~~and~~ } q\bigl([t , \infty)\bigr ) \ge 1\tau\bigr\}\,.\ ] ] it is well  known that @xmath169 is a compact interval . [ noisecond ] let @xmath166 . 1 . a probability measure @xmath164 on @xmath165 is said to have a * @xmath159quantile of type @xmath170 * , if there exist a @xmath159quantile @xmath171 and a constant @xmath172 such that , for all @xmath173 $ ] , we have @xmath174 2 . let @xmath175 $ ] . we say that a probability measure @xmath20 on @xmath176 has a * @xmath159quantile of @xmath177average type @xmath170 * if the conditional probability measure @xmath178 has @xmath179almost surely a @xmath159quantile of type @xmath170 and the function @xmath180 where @xmath181 is the constant defined in part ( 1 ) , satisfies @xmath182 . 
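The link between the pinball loss and the quantile can be checked numerically. The sketch below uses the standard textbook form of the pinball loss (the formula itself is elided in this extract, so this form is an assumption) and verifies on simulated data that the constant prediction minimizing the empirical pinball risk sits at the empirical quantile. The function names, the sample, and the grid are illustrative choices only.

```python
import numpy as np

def pinball_loss(y, t, tau):
    """Standard pinball (check) loss: tau*(y - t) if y >= t, else (1 - tau)*(t - y)."""
    r = np.asarray(y, dtype=float) - t
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)

def empirical_risk(y, t, tau):
    """Average pinball loss of the constant prediction t on the sample y."""
    return float(np.mean(pinball_loss(y, t, tau)))

# Hypothetical sample: standard normal outputs, quantile level tau = 0.25.
rng = np.random.default_rng(0)
y = rng.normal(size=2001)
tau = 0.25

# Brute-force minimization of the empirical risk over a grid of constants.
grid = np.linspace(-3.0, 3.0, 1201)
risks = np.array([empirical_risk(y, t, tau) for t in grid])
t_best = grid[np.argmin(risks)]

# The grid minimizer should agree with the empirical quantile up to the grid spacing.
q_emp = np.quantile(y, tau)
```

The loss is convex and Lipschitz continuous in the prediction with constant max(tau, 1 - tau), but not differentiable at t = y, which is why it fits the general assumptions on the loss stated earlier while the least squares loss does not.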
one can show that a distribution @xmath164 having a @xmath159-quantile of type @xmath170 has a unique @xmath159-quantile @xmath183 . moreover , if @xmath164 has a lebesgue density @xmath184 then @xmath164 has a @xmath159-quantile of type @xmath170 if @xmath184 is bounded away from zero on @xmath185 $ ] since we can use @xmath186\}$ ] in ( [ tauquantileoftype2formula ] ) . this assumption is general enough to cover many distributions used in parametric statistics such as gaussian , student 's @xmath187 , and logistic distributions ( with @xmath188 ) , gamma and log-normal distributions ( with @xmath189 ) , and uniform and beta distributions ( with @xmath190 $ ] ) . the following theorem , to be proved in section [ proofsection ] , gives a learning rate for the regularization scheme ( [ algor ] ) in the special case of quantile regression . [ quantilethm ] suppose that @xmath191 almost surely for some constant @xmath192 , and that each kernel @xmath41 is @xmath193 with @xmath194 for some @xmath195 . if assumption [ assumption1 ] holds with @xmath112 and @xmath6 has a @xmath159-quantile of @xmath177-average type @xmath170 for some @xmath196 $ ] , then by taking @xmath197 , for any @xmath198 and @xmath199 , with confidence at least @xmath200 we have @xmath201 where @xmath202 is a constant independent of @xmath203 and @xmath204 and @xmath205 . please note that the exponent @xmath206 given by ( [ quantilerates2 ] ) for the learning rate in ( [ quantilerates ] ) is independent of the quantile level @xmath159 , of the number @xmath148 of additive components in @xmath207 , and of the dimensions @xmath208 and @xmath209 . further note that @xmath210 if @xmath211 , and @xmath212 if @xmath213 . because @xmath214 can be arbitrarily close to @xmath215 , the learning rate , which is independent of the dimension @xmath216 and given by theorem [ quantilethm ] , is close to @xmath217 for large values of @xmath177 and is close to @xmath218 or better if @xmath211 . 
to state our general learning rates , we need an assumption on a _ variance-expectation bound _ which is similar to definition [ noisecond ] in the special case of quantile regression . [ assumption3 ] we assume that there exist an exponent @xmath219 $ ] and a positive constant @xmath220 such that @xmath221 assumption [ assumption3 ] always holds true for @xmath222 . if the triple @xmath223 satisfies some conditions , the exponent @xmath224 can be larger . for example , when @xmath12 is the pinball loss ( [ pinloss ] ) and @xmath6 has a @xmath159-quantile of @xmath177-average type @xmath225 for some @xmath196 $ ] and @xmath226 as defined in @xcite , then @xmath227 . [ mainratesthm ] suppose that @xmath228 is bounded by a constant @xmath229 almost surely . under assumptions [ assumption1 ] to [ assumption3 ] , if we take @xmath198 and @xmath230 for some @xmath231 , then for any @xmath232 , with confidence at least @xmath200 we have @xmath233 where @xmath234 is given by @xmath235 and @xmath202 is a constant independent of @xmath203 or @xmath204 ( to be given explicitly in the proof ) . we now add some theoretical and numerical comparisons of our learning rates with those from the literature . as already mentioned in the introduction , some reasons for the popularity of additive models are flexibility , increased interpretability , and ( often ) a reduced proneness to the curse of high dimensionality . hence it is important to check whether the learning rate given in theorem [ mainratesthm ] under the assumption of an additive model compares favourably to ( essentially ) optimal learning rates without this assumption . in other words , we need to demonstrate that the main goal of this paper is achieved by theorem [ quantilethm ] and theorem [ mainratesthm ] , i.e. 
that an svm based on an additive kernel can provide a substantially better learning rate in high dimensions than an svm with a general kernel , say a classical gaussian rbf kernel , provided the assumption of an additive model is satisfied . our learning rate in theorem [ quantilethm ] is new and optimal in the literature of svm for quantile regression . most learning rates in the literature of svm for quantile regression are given for projected output functions @xmath236 , while it is well known that projections improve learning rates @xcite . here the projection operator @xmath237 is defined for any measurable function @xmath10 by @xmath238 sometimes this is called clipping . such results are given in @xcite . for example , under the assumptions that @xmath6 has a @xmath159quantile of @xmath177average type @xmath170 , the approximation error condition ( [ approxerrorb ] ) is satisfied for some @xmath239 , and that for some constants @xmath240 , the sequence of eigenvalues @xmath241 of the integral operator @xmath117 satisfies @xmath242 for every @xmath243 , it was shown in @xcite that with confidence at least @xmath200 , @xmath244 where @xmath245 here the parameter @xmath246 measures the capacity of the rkhs @xmath247 and it plays a similar role as half of the parameter @xmath147 in assumption 2 . for a @xmath193 kernel and @xmath112 , one can choose @xmath246 and @xmath147 to be arbitrarily small and the above power index @xmath248 can be taken as @xmath249 . the learning rate in theorem [ quantilethm ] may be improved by relaxing assumption 1 to a sobolev smoothness condition for @xmath121 and a regularity condition for the marginal distribution @xmath250 . for example , one may use a gaussian kernel @xmath251 depending on the sample size @xmath203 and @xcite achieve the approximation error condition ( [ approxerrorb ] ) for some @xmath252 . this is done for quantile regression in @xcite . 
since we are mainly interested in additive models , we shall not discuss such an extension . [ gaussmore ] let @xmath48 , @xmath49 $ ] and @xmath50 ^ 2.$ ] let @xmath51 and the additive kernel @xmath72 be given by ( [ gaussaddform ] ) with @xmath253 in example [ gaussadd ] as @xmath52.\ ] ] if the function @xmath121 is given by ( [ gaussfcn ] ) , @xmath191 almost surely for some constant @xmath192 , and @xmath6 has a @xmath159quantile of @xmath177average type @xmath170 for some @xmath196 $ ] , then by taking @xmath197 , for any @xmath145 and @xmath199 , ( [ quantilerates ] ) holds with confidence at least @xmath200 . it is unknown whether the above learning rate can be derived by existing approaches in the literature ( e.g. @xcite ) even after projection . note that the kernel in the above example is independent of the sample size . it would be interesting to see whether there exists some @xmath99 such that the function @xmath57 defined by ( [ gaussfcn ] ) lies in the range of the operator @xmath254 . the existence of such a positive index would lead to the approximation error condition ( [ approxerrorb ] ) , see @xcite . let us now add some numerical comparisons on the goodness of our learning rates given by theorem [ mainratesthm ] with those given by @xcite . their corollary 4.12 gives ( essentially ) minmax optimal learning rates for ( clipped ) svms in the context of nonparametric quantile regression using one gaussian rbf kernel on the whole input space under appropriate smoothness assumptions of the target function . let us consider the case that the distribution @xmath6 has a @xmath159quantile of @xmath177average type @xmath170 , where @xmath255 , and assume that both corollary 4.12 in @xcite and our theorem [ mainratesthm ] are applicable . i.e. , we assume in particular that @xmath6 is a probability measure on @xmath256 $ ] and that the marginal distribution @xmath257 has a lebesgue density @xmath258 for some @xmath259 . 
furthermore , suppose that the optimal decision function @xmath260 has ( to make theorem [ mainratesthm ] applicable with @xmath261 $ ] ) the additive structure @xmath207 with each @xmath104 as stated in assumption [ assumption1 ] , where @xmath262 and @xmath263 , with minimal risk @xmath86 and additionally fulfills ( to make corollary 4.12 in @xcite applicable ) @xmath264 where @xmath265 $ ] and @xmath266 denotes a besov space with smoothness parameter @xmath267 . the intuitive meaning of @xmath248 is , that increasing values of @xmath248 correspond to increased smoothness . we refer to ( * ? ? ? * and p. 44 ) for details on besov spaces . it is well  known that the besov space @xmath268 contains the sobolev space @xmath269 for @xmath270 , @xmath271 , and @xmath272 , and that @xmath273 . we mention that if all @xmath41 are suitably chosen wendland kernels , their reproducing kernel hilbert spaces @xmath43 are sobolev spaces , see ( * ? ? ? * thm . 10.35 , p. 160 ) . furthermore , we use the same sequence of regularizing parameters as in ( * ? ? ? 4.9 , cor . 4.12 ) , i.e. , @xmath274 where @xmath275 , @xmath276 , @xmath277 $ ] , and @xmath278 is some user  defined positive constant independent of @xmath279 . for reasons of simplicity , let us fix @xmath280 . then ( * ? ? ? 4.12 ) gives learning rates for the risk of svms for @xmath159quantile regression , if a single gaussian rbf  kernel on @xmath281 is used for @xmath159quantile functions of @xmath177average type @xmath170 with @xmath255 , which are of order @xmath282 hence the learning rate in theorem [ quantilethm ] is better than the one in ( * ? ? ? 4.12 ) in this situation , if @xmath283 provided the assumption of the additive model is valid . table [ table1 ] lists the values of @xmath284 from ( [ explicitratescz2 ] ) for some finite values of the dimension @xmath216 , where @xmath285 . all of these values of @xmath284 are positive with the exceptions if @xmath286 or @xmath287 . 
this is in contrast to the corresponding exponent in the learning rate by ( * ? ? * cor . 4.12 ) , because @xmath288 table [ table2 ] and figures [ figure1 ] to [ figure2 ] give additional information on the limit @xmath289 . of course , higher values of the exponent indicate faster rates of convergence . it is obvious that an svm based on an additive kernel has a significantly faster rate of convergence in higher dimensions @xmath216 than an svm based on a single gaussian rbf kernel defined on the whole input space , of course under the assumption that the additive model is valid . the figures seem to indicate that our learning rate from theorem [ mainratesthm ] is probably not optimal for small dimensions . however , the main focus of the present paper is on high dimensions . [ table1 ] the table lists the limits of the exponents @xmath290 from ( * ? ? ? * cor . 4.12 ) and @xmath291 from theorem [ mainratesthm ] , respectively , if the regularizing parameter @xmath292 is chosen in an optimal manner for the nonparametric setup , i.e. @xmath293 , with @xmath294 for @xmath295 and @xmath296 . recall that @xmath297 $ ] ."  "additive models play an important role in semiparametric statistics .
this paper gives learning rates for regularized kernel based methods for additive models .
these learning rates compare favourably in particular in high dimensions to recent results on optimal learning rates for purely nonparametric regularized kernel based quantile regression using the gaussian radial basis function kernel , provided the assumption of an additive model is valid .
additionally , a concrete example is presented to show that a gaussian function depending only on one variable lies in a reproducing kernel hilbert space generated by an additive gaussian kernel , but does not belong to the reproducing kernel hilbert space generated by the multivariate gaussian kernel of the same variance .
key words and phrases : additive model , kernel , quantile regression , semiparametric , rate of convergence , support vector machine ." 
"the leptonic decays of a charged pseudoscalar meson @xmath7 are processes of the type @xmath8 , where @xmath9 , @xmath10 , or @xmath11 . because no strong interactions are present in the leptonic final state @xmath12 , such decays provide a clean way to probe the complex , strong interactions that bind the quark and antiquark within the initial  state meson . in these decays , strong interaction effects can be parametrized by a single quantity , @xmath13 , the pseudoscalar meson decay constant . the leptonic decay rate can be measured by experiment , and the decay constant can be determined by the equation ( ignoring radiative corrections ) @xmath14 where @xmath15 is the fermi coupling constant , @xmath16 is the cabibbo  kobayashi  maskawa ( ckm ) matrix @xcite element , @xmath17 is the mass of the meson , and @xmath18 is the mass of the charged lepton . the quantity @xmath13 describes the amplitude for the @xmath19 and @xmath20quarks within the @xmath21 to have zero separation , a condition necessary for them to annihilate into the virtual @xmath22 boson that produces the @xmath12 pair . the experimental determination of decay constants is one of the most important tests of calculations involving nonperturbative qcd . such calculations have been performed using various models @xcite or using lattice qcd ( lqcd ) . the latter is now generally considered to be the most reliable way to calculate the quantity . knowledge of decay constants is important for describing several key processes , such as @xmath23 mixing , which depends on @xmath24 , a quantity that is also predicted by lqcd calculations . experimental determination @xcite of @xmath24 with the leptonic decay of a @xmath25 meson is , however , very limited as the rate is highly suppressed due to the smallness of the magnitude of the relevant ckm matrix element @xmath26 . 
the charm mesons , @xmath27 and @xmath28 , are better instruments to study the leptonic decays of heavy mesons since these decays are either less ckm suppressed or favored , _ i.e. _ , @xmath29 and @xmath30 are much larger than @xmath31 . thus , the decay constants @xmath32 and @xmath33 determined from charm meson decays can be used to test and validate the necessary lqcd calculations applicable to the @xmath34-meson sector . among the leptonic decays in the charm-quark sector , @xmath35 decays are more accessible since they are ckm favored . furthermore , the large mass of the @xmath11 lepton removes the helicity suppression that is present in the decays to lighter leptons . the existence of multiple neutrinos in the final state , however , makes measurement of this decay challenging . physics beyond the standard model ( sm ) might also affect leptonic decays of charmed mesons . depending on the non-sm features , the ratio of @xmath36 could be affected @xcite , as could the ratio @xcite @xmath37 . any of the individual widths might be increased or decreased . there is an indication of a discrepancy between the experimental determinations @xcite of @xmath33 and the most recent precision lqcd calculation @xcite . this disagreement is particularly puzzling since the cleo-c determination @xcite of @xmath32 agrees well with the lqcd calculation @xcite of that quantity . some authors @xcite conjecture that this discrepancy may be explained by a charged higgs boson or a leptoquark . in this article , we report an improved measurement of the absolute branching fraction of the leptonic decay @xmath0 ( charge-conjugate modes are implied ) , with @xmath1 , from which we determine the decay constant @xmath33 . we use a data sample of @xmath38 events provided by the cornell electron storage ring ( cesr ) and collected by the cleo-c detector at the center-of-mass ( cm ) energy @xmath39 mev , near @xmath3 peak production @xcite . 
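The relation between the leptonic width and the decay constant introduced earlier, and the helicity suppression noted above, can be sketched numerically. The code uses the standard tree-level expression built from the ingredients the text lists (Fermi constant, CKM element, meson mass, lepton mass); since the formula itself is elided in this extract, treat that form as an assumption. The masses, decay constant, and CKM magnitude below are rough illustrative inputs, not values taken from the article, and the function names are hypothetical.

```python
import math

G_F = 1.1663787e-5  # Fermi coupling constant in GeV^-2

def leptonic_width(f_p, m_p, m_l, v_ckm):
    """Tree-level leptonic width in GeV, ignoring radiative corrections.

    f_p: decay constant (GeV), m_p: meson mass (GeV),
    m_l: charged-lepton mass (GeV), v_ckm: magnitude of the CKM element.
    """
    phase = (1.0 - m_l ** 2 / m_p ** 2) ** 2
    return G_F ** 2 / (8.0 * math.pi) * f_p ** 2 * m_l ** 2 * m_p * phase * v_ckm ** 2

def decay_constant(width, m_p, m_l, v_ckm):
    """Invert the width formula to extract the decay constant (GeV)."""
    phase = (1.0 - m_l ** 2 / m_p ** 2) ** 2
    return math.sqrt(width * 8.0 * math.pi / (G_F ** 2 * m_l ** 2 * m_p * phase * v_ckm ** 2))

# Illustrative inputs only (assumptions, in GeV).
M_DS, M_TAU, M_MU = 1.968, 1.777, 0.1057
F_DS, V_CS = 0.25, 0.97

width_tau = leptonic_width(F_DS, M_DS, M_TAU, V_CS)
width_mu = leptonic_width(F_DS, M_DS, M_MU, V_CS)

# The decay constant and CKM element cancel in the ratio; only the masses remain.
ratio_tau_mu = width_tau / width_mu
```

The factor m_l^2 is the helicity suppression: it makes the muonic width much smaller than it would otherwise be, while the phase-space factor penalizes the heavy tau, so the ratio of the two widths depends only on the lepton and meson masses.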
the data sample consists of an integrated luminosity of @xmath40 @xmath41 containing @xmath42 @xmath3 pairs . we have previously reported @xcite measurements of @xmath43 and @xmath0 with a subsample of these data . a companion article @xcite reports measurements of @xmath33 from @xmath43 and @xmath0 , with @xmath44 , using essentially the same data sample as the one used in this measurement . the cleo-c detector @xcite is a general-purpose solenoidal detector with four concentric components utilized in this measurement : a small-radius six-layer stereo wire drift chamber , a 47-layer main drift chamber , a ring-imaging cherenkov ( rich ) detector , and an electromagnetic calorimeter consisting of 7800 csi(tl ) crystals . the two drift chambers operate in a @xmath45 t magnetic field and provide charged particle tracking in a solid angle of @xmath46% of @xmath47 . the chambers achieve a momentum resolution of @xmath48% at @xmath49 gev/@xmath50 . the main drift chamber also provides specific-ionization ( @xmath51 ) measurements that discriminate between charged pions and kaons . the rich detector covers approximately @xmath52% of @xmath47 and provides additional separation of pions and kaons at high momentum . the photon energy resolution of the calorimeter is @xmath53% at @xmath54 gev and @xmath55% at @xmath56 mev . electron identification is based on a likelihood variable that combines the information from the rich detector , @xmath51 , and the ratio of electromagnetic shower energy to track momentum ( @xmath57 ) . we use a geant-based @xcite monte carlo ( mc ) simulation program to study the efficiency of signal-event selection and background processes . physics events are generated by evtgen @xcite , tuned with much improved knowledge of charm decays @xcite , and final-state radiation ( fsr ) is modeled by the photos @xcite program . 
the modeling of initial  state radiation ( isr ) is based on cross sections for @xmath3 production at lower energies obtained from the cleo  c energy scan @xcite near the cm energy where we collect the sample . the presence of two @xmath58 mesons in a @xmath3 event allows us to define a single  tag ( st ) sample in which a @xmath58 is reconstructed in a hadronic decay mode and a further double  tagged ( dt ) subsample in which an additional @xmath59 is required as a signature of @xmath60 decay , the @xmath59 being the daughter of the @xmath60 . the @xmath61 reconstructed in the st sample can be either primary or secondary from @xmath62 ( or @xmath63 ) . the st yield can be expressed as @xmath64 where @xmath65 is the produced number of @xmath3 pairs , @xmath66 is the branching fraction of hadronic modes used in the st sample , and @xmath67 is the st efficiency . the @xmath68 counts the candidates , not events , and the factor of 2 comes from the sum of @xmath28 and @xmath61 tags . our double  tag ( dt ) sample is formed from events with only a single charged track , identified as an @xmath69 , in addition to a st . the yield can be expressed as @xmath70 where @xmath71 is the leptonic decay branching fraction , including the subbranching fraction of @xmath1 decay , @xmath72 is the efficiency of finding the st and the leptonic decay in the same event . from the st and dt yields we can obtain an absolute branching fraction of the leptonic decay @xmath71 , without needing to know the integrated luminosity or the produced number of @xmath3 pairs , @xmath73 where @xmath74 ( @xmath75 ) is the effective signal efficiency . because of the large solid angle acceptance with high segmentation of the cleo  c detector and the low multiplicity of the events with which we are concerned , @xmath76 , where @xmath77 is the leptonic decay efficiency . 
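as a minimal sketch of the double-tag arithmetic described above , the snippet below assumes the yields take the form n_st = 2 n b_st eff_st and n_dt = 2 n b_st b_l eff_dt , so that the pair count n and the tag branching fraction cancel in the ratio . the function name and all numbers are hypothetical and are not taken from the measurement .

```python
def leptonic_branching_fraction(n_dt, n_st, eff_dt, eff_st):
    """Branching fraction from single-tag (ST) and double-tag (DT) yields.

    Assumed relations (a reading of the text, not the paper's exact formulas):
        n_st = 2 * N * B_tag * eff_st
        n_dt = 2 * N * B_tag * B_lep * eff_dt
    so B_lep = n_dt / (n_st * eff_signal) with eff_signal = eff_dt / eff_st.
    """
    if n_st <= 0 or eff_dt <= 0 or eff_st <= 0:
        raise ValueError("yields and efficiencies must be positive")
    eff_signal = eff_dt / eff_st  # effective signal efficiency
    return n_dt / (n_st * eff_signal)


# Toy closure check with made-up inputs: generate yields from a known
# branching fraction and recover it without using N or B_tag.
toy_pairs, toy_b_tag, toy_b_lep = 1000.0, 0.05, 0.06
toy_eff_st, toy_eff_dt = 0.4, 0.3
toy_n_st = 2 * toy_pairs * toy_b_tag * toy_eff_st
toy_n_dt = 2 * toy_pairs * toy_b_tag * toy_b_lep * toy_eff_dt
recovered = leptonic_branching_fraction(toy_n_dt, toy_n_st, toy_eff_dt, toy_eff_st)
```

the closure check only illustrates why neither the integrated luminosity nor the number of produced pairs is needed once both yields are measured .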
hence , the ratio @xmath78 is insensitive to most systematic effects associated with the st , and the signal branching fraction @xmath71 obtained using this procedure is nearly independent of the efficiency of the tagging mode . to minimize systematic uncertainties , we tag using three two-body hadronic decay modes with only charged particles in the final state . the three st modes are @xmath86 , @xmath79 , and @xmath87 . ( two of these labels , one of them @xmath79 , are shorthand for @xmath80 events within mass windows , described below , of the @xmath81 peak in @xmath82 and the @xmath83 peak in @xmath84 , respectively ; no attempt is made to separate these resonance components in the @xmath85 dalitz plot . ) using these tag modes also helps to reduce the tag bias which would be caused by the correlation between the tag side and the signal side reconstruction if tag modes with high multiplicity and large background were used . the effect of the tag bias @xmath88 can be expressed in terms of the signal efficiency @xmath74 defined by @xmath89 where @xmath90 is the st efficiency when the recoiling system is the signal leptonic decay with a single @xmath59 on the other side of the tag . because the general st efficiency @xmath67 , for which the recoiling system is any possible @xmath91 decay , is lower than @xmath90 , a sizable tag bias could be introduced if the multiplicity of the tag mode were high , or if the tag mode were to include neutral particles in the final state . as shown in sec . [ sec : results ] , this effect is negligible in our chosen clean tag modes . the @xmath92 decay is reconstructed by combining oppositely charged tracks that originate from a common vertex and that have an invariant mass within @xmath93 mev of the nominal mass @xcite . we require the resonance decay to satisfy the following mass windows around the nominal masses @xcite : @xmath94 ( @xmath95 mev ) and @xmath96 ( @xmath97 mev ) . 
we require the momenta of charged particles to be @xmath56 mev or greater to suppress the slow pion background from @xmath98 decays ( through @xmath99 ) . we identify a st by using the invariant mass of the tag @xmath100 and recoil mass against the tag @xmath101 . the recoil mass is defined as @xmath102 where @xmath103 is the net four  momentum of the @xmath4 beam , taking the finite beam crossing angle into account ; @xmath104 is the four  momentum of the tag , with @xmath105 computed from @xmath106 and the nominal mass @xcite of the @xmath91 meson . we require the recoil mass to be within @xmath107 mev of the @xmath108 mass @xcite . this loose window allows both primary and secondary @xmath91 tags to be selected . to estimate the backgrounds in our st and dt yields from the wrong tag combinations ( incorrect combinations that , by chance , lie within the @xmath109 signal region ) , we use the tag invariant mass sidebands . we define the signal region as @xmath110 mev @xmath111 mev , and the sideband regions as @xmath112 mev @xmath113 mev or @xmath114 mev @xmath115 mev , where @xmath116 is the difference between the tag mass and the nominal mass . we fit the st @xmath109 distributions to the sum of double  gaussian signal function plus second  degree chebyshev polynomial background function to get the tag mass sideband scaling factor . the invariant mass distributions of tag candidates for each tag mode are shown in fig . [ fig : dm ] and the st yield and @xmath109 sideband scaling factor are summarized in table [ table : data  single ] . we find @xmath117 summed over the three tag modes . .[table : data  single ] summary of single  tag ( st ) yields , where @xmath118 is the yield in the st mass signal region , @xmath119 is the yield in the sideband region , @xmath120 is the sideband scaling factor , and @xmath68 is the scaled sideband  subtracted yield . 
we considered six semileptonic decays , @xmath121 @xmath122 , @xmath123 , @xmath124 , @xmath125 , @xmath126 , and @xmath127 , as the major sources of background in the @xmath128 signal region . the second dominates the nonpeaking background , and the fourth ( with @xmath129 ) dominates the peaking background . uncertainty in the signal yield due to nonpeaking background ( @xmath130 ) is assessed by varying the semileptonic decay branching fractions by the precision with which they are known @xcite . imperfect knowledge of @xmath131 gives rise to a systematic uncertainty in our estimate of the amount of peaking background in the signal region , which has an effect on our branching fraction measurement of @xmath132 . we study differences in efficiency , data vs mc events , due to the extra energy requirement , extra track veto , and @xmath133 requirement , by using samples from data and mc events , in which _ both _ the @xmath134 and @xmath2 satisfy our tag requirements , i.e. , `` double-tag '' events . we then apply each of the above-mentioned requirements and compare loss in efficiency of data vs mc events . in this way we obtain a correction of @xmath135 for the extra energy requirement and systematic uncertainties on each of the three requirements of @xmath136 ( all equal , by chance ) . the non-@xmath69 background in the signal @xmath69 candidate sample is negligible ( @xmath137 ) due to the low probability ( @xmath138 per track ) that hadrons ( @xmath139 or @xmath140 ) are misidentified as @xmath69 @xcite . uncertainty in these backgrounds produces a @xmath141 uncertainty in the measurement of @xmath142 . the secondary @xmath69 backgrounds from charge symmetric processes , such as @xmath143 dalitz decay ( @xmath144 ) and @xmath145 conversion ( @xmath146 ) , are assessed by measuring the wrong-sign signal electron in events with @xmath147 . 
the uncertainty in the measurement from this source is estimated to be @xmath148 . other possible sources of systematic uncertainty include @xmath68 ( @xmath137 ) , tag bias ( @xmath149 ) , tracking efficiency ( @xmath148 ) , @xmath59 identification efficiency ( @xmath150 ) , and fsr ( @xmath150 ) . combining all contributions in quadrature , the total systematic uncertainty in the branching fraction measurement is estimated to be @xmath151 . in summary , using the sample of @xmath152 tagged @xmath28 decays with the cleo  c detector we obtain the absolute branching fraction of the leptonic decay @xmath153 through @xmath154 @xmath155 where the first uncertainty is statistical and the second is systematic . this result supersedes our previous measurement @xcite of the same branching fraction , which used a subsample of data used in this work . the decay constant @xmath33 can be computed using eq . ( [ eq : f ] ) with known values @xcite @xmath156 gev@xmath157 , @xmath158 mev , @xmath159 mev , and @xmath160 s. we assume @xmath161 and use the value @xmath162 given in ref . we obtain @xmath163 combining with our other determination @xcite of @xmath164 mev with @xmath43 and @xmath0 ( @xmath165 ) decays , we obtain @xmath166 this result is derived from absolute branching fractions only and is the most precise determination of the @xmath91 leptonic decay constant to date . our combined result is larger than the recent lqcd calculation @xmath167 mev @xcite by @xmath168 standard deviations . the difference between data and lqcd for @xmath33 could be due to physics beyond the sm @xcite , unlikely statistical fluctuations in the experimental measurements or the lqcd calculation , or systematic uncertainties that are not understood in the lqcd calculation or the experimental measurements . 
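the quadrature combination of systematic contributions mentioned above can be sketched as follows . the individual values in the example are hypothetical placeholders in percent , not the uncertainties quoted in this measurement , and the combination assumes the sources are uncorrelated .

```python
import math


def combine_in_quadrature(contributions):
    """Total uncertainty from independent contributions: sqrt(sum of squares)."""
    return math.sqrt(sum(c * c for c in contributions))


# Hypothetical relative uncertainties in percent (illustration only).
example_sources = {
    "background": 0.7,
    "tracking": 0.3,
    "particle identification": 1.0,
    "final-state radiation": 1.0,
}
example_total = combine_in_quadrature(example_sources.values())
```

because the terms add as squares , the total is dominated by the largest one or two contributions .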
combining with our other determination @xcite of @xmath169 , via @xmath44 , we obtain @xmath170 using this with our measurement @xcite of @xmath171 , we obtain the branching fraction ratio @xmath172 this is consistent with @xmath173 , the value predicted by the sm with lepton universality , as given in eq . ( [ eq : f ] ) with known masses @xcite . we gratefully acknowledge the effort of the cesr staff in providing us with excellent luminosity and running conditions . d. cronin  hennessy and a. ryd thank the a.p . sloan foundation . this work was supported by the national science foundation , the u.s . department of energy , the natural sciences and engineering research council of canada , and the u.k . science and technology facilities council . c. amsler _ et al . _ ( particle data group ) , phys . b * 667 * , 1 ( 2008 ) . k. ikado _ et al . _ ( belle collaboration ) , phys . lett . * 97 * , 251802 ( 2006 ) . b. aubert _ et al . _ ( babar collaboration ) , phys . rev . d * 77 * , 011107 ( 2008 ) . a. g. akeroyd and c. h. chen , phys . d * 75 * , 075004 ( 2007 ) ; a. g. akeroyd , prog . phys . * 111 * , 295 ( 2004 ) . j. l. hewett , arxiv : hep  ph/9505246 . w. s. hou , phys . d * 48 * , 2342 ( 1993 ) . e. follana , c. t. h. davies , g. p. lepage , and j. shigemitsu ( hpqcd collaboration ) , phys . lett . * 100 * , 062002 ( 2008 ) . b. i. eisenstein _ et al . _ ( cleo collaboration ) , phys . rev . d * 78 * , 052003 ( 2008 ) . b. a. dobrescu and a. s. kronfeld , phys . * 100 * , 241802 ( 2008 ) . d. cronin  hennessy _ et al . _ ( cleo collaboration ) , arxiv:0801.3418 . m. artuso _ et al . _ ( cleo collaboration ) , phys . lett . * 99 * , 071802 ( 2007 ) . k. m. ecklund _ et al . _ ( cleo collaboration ) , phys . rev . lett . * 100 * , 161801 ( 2008 ) . j. p. alexander _ et al . _ ( cleo collaboration ) , phys . rev . d * 79 * , 052001 ( 2009 ) . y. kubota _ et al . _ ( cleo collaboration ) , nucl . instrum . a * 320 * , 66 ( 1992 ) . d. peterson _ et al . 
_ , instrum . methods phys . , sec . a * 478 * , 142 ( 2002 ) . m. artuso _ et al . _ , nucl . instrum . methods phys . a * 502 * , 91 ( 2003 ) . s. dobbs _ et al . _ ( cleo collaboration ) , phys . rev . d * 76 * , 112001 ( 2007 ) . j. p. alexander _ et al . _ ( cleo collaboration ) , phys . rev . lett . * 100 * , 161804 ( 2008 ) . e. barberio and z. was , comput . . commun . * 79 * , 291 ( 1994 ) ."  "we have studied the leptonic decay @xmath0 , via the decay channel @xmath1 , using a sample of tagged @xmath2 decays collected near the @xmath3 peak production energy in @xmath4 collisions with the cleo  c detector .
we obtain @xmath5 and determine the decay constant @xmath6 mev , where the first uncertainties are statistical and the second are systematic ." 
"the transport properties of nonlinear non  equilibrium dynamical systems are far from well  understood@xcite . consider in particular so  called ratchet systems which are asymmetric periodic potentials where an ensemble of particles experience directed transport@xcite . the origins of the interest in this lie in considerations about extracting useful work from unbiased noisy fluctuations as seems to happen in biological systems@xcite . recently attention has been focused on the behavior of deterministic chaotic ratchets@xcite as well as hamiltonian ratchets@xcite . chaotic systems are defined as those which are sensitively dependent on initial conditions . whether chaotic or not , the behavior of nonlinear systems including the transition from regular to chaotic behavior is in general sensitively dependent on the parameters of the system . that is , the phase  space structure is usually relatively complicated , consisting of stability islands embedded in chaotic seas , for examples , or of simultaneously co  existing attractors . this can change significantly as parameters change . for example , stability islands can merge into each other , or break apart , and the chaotic sea itself may get pinched off or otherwise changed , or attractors can change symmetry or bifurcate . this means that the transport properties can change dramatically as well . a few years ago , mateos@xcite considered a specific ratchet model with a periodically forced underdamped particle . he looked at an ensemble of particles , specifically the velocity for the particles , averaged over time and the entire ensemble . he showed that this quantity , which is an intuitively reasonable definition of ` the current ' , could be either positive or negative depending on the amplitude @xmath0 of the periodic forcing for the system . at the same time , there exist ranges in @xmath0 where the trajectory of an individual particle displays chaotic dynamics . 
mateos conjectured a connection between these two phenomena , specifically that the reversal of current direction was correlated with a bifurcation from chaotic to periodic behavior in the trajectory dynamics . even though it is unlikely that such a result would be universally valid across all chaotic deterministic ratchets , it would still be extremely useful to have general heuristic rules such as this . these organizing principles would allow some handle on characterizing the many different kinds of behavior that are possible in such systems . a later investigation@xcite of the mateos conjecture by barbi and salerno , however , argued that it was not a valid rule even in the specific system considered by mateos . they presented results showing that it was possible to have current reversals in the absence of bifurcations from periodic to chaotic behavior . they proposed an alternative origin for the current reversal , suggesting it was related to the different stability properties of the rotating periodic orbits of the system . these latter results seem fundamentally sensible . however , this paper based its arguments about currents on the behavior of a _ single _ particle as opposed to an ensemble . this implicitly assumes that the dynamics of the system are ergodic . this is not true in general for chaotic systems of the type being considered . in particular , there can be extreme dependence of the result on the statistics of the ensemble being considered . this has been pointed out in earlier studies @xcite which laid out a detailed methodology for understanding transport properties in such a mixed regular and chaotic system . depending on specific parameter value , the particular system under consideration has multiple coexisting periodic or chaotic attractors or a mixture of both . it is hence appropriate to understand how a probability ensemble might behave in such a system . 
the details of the dependence on the ensemble are particularly relevant to the issue of the possible experimental validation of these results , since experiments are always conducted , by virtue of finite  precision , over finite time and finite ensembles . it is therefore interesting to probe the results of barbi and salerno with regard to the details of the ensemble used , and more formally , to see how ergodicity alters our considerations about the current , as we do in this paper . we report here on studies on the properties of the current in a chaotic deterministic ratchet , specifically the same system as considered by mateos@xcite and barbi and salerno@xcite . we consider the impact of different kinds of ensembles of particles on the current and show that the current depends significantly on the details of the initial ensemble . we also show that it is important to discard transients in quantifying the current . this is one of the central messages of this paper : broad heuristics are rare in chaotic systems , and hence it is critical to understand the ensemble  dependence in any study of the transport properties of chaotic ratchets . having established this , we then proceed to discuss the connection between the bifurcation diagram for individual particles and the behavior of the current . we find that while we disagree with many of the details of barbi and salerno s results , the broader conclusion still holds . that is , it is indeed possible to have current reversals in the absence of bifurcations from chaos to periodic behavior as well as bifurcations without any accompanying current reversals . the result of our investigation is therefore that the transport properties of a chaotic ratchet are not as simple as the initial conjecture . however , we do find evidence for a generalized version of mateos s conjecture . 
that is , in general , bifurcations for trajectory dynamics as a function of system parameter seem to be associated with abrupt changes in the current . depending on the specific value of the current , these abrupt changes may lead the net current to reverse direction , but not necessarily so . we start below with a preparatory discussion necessary to understand the details of the connection between bifurcations and current reversal , in which we discuss the potential and phase space for single trajectories of this system and define a bifurcation diagram for it . in the next section , we discuss the subtleties of establishing a connection between the behavior of individual trajectories and of ensembles . after this , we are able to compare details of specific trajectory bifurcation curves with current curves , and thus justify our broader statements above , after which we conclude . the goal of these studies is to understand the behavior of general chaotic ratchets . the approach taken here is that to discover heuristic rules we must consider specific systems in great detail before generalizing . we choose the same @xmath1-dimensional ratchet considered previously by mateos@xcite , as well as barbi and salerno@xcite . we consider an ensemble of particles moving in an asymmetric periodic potential , driven by a periodic time-dependent external force , where the force has a zero time-average . there is no noise in the system , so it is completely deterministic , although there is damping . the equations of motion for an individual trajectory for such a system are given in dimensionless variables by @xmath2 where the periodic asymmetric potential can be written in the form @xmath3 + \frac{1}{4} \sin [ 4\pi ( x - x_0 ) ] \bigg ] . in this equation @xmath4 have been introduced for convenience such that one potential minimum exists at the origin with @xmath5 and the term @xmath6 . 
[ figure : ( a ) classical phase space for the unperturbed system . for @xmath7 , @xmath8 , two chaotic attractors emerge with @xmath9 ( b ) and @xmath10 ( c ) , together with a period-four attractor consisting of the four centers of the circles with @xmath11 . ]
the phase space of the undamped undriven ratchet , the system corresponding to the unperturbed potential @xmath12 , looks like a series of asymmetric pendula . that is , individual trajectories have one of the following possible time-asymptotic behaviors : ( i ) inside the potential wells , trajectories and all their properties oscillate , leading to zero net transport . outside the wells , the trajectories either ( ii ) librate to the right or ( iii ) to the left , with corresponding net transport depending upon initial conditions . there are also ( iv ) trajectories on the separatrices between the oscillating and librating orbits , moving between unstable fixed points in infinite time , as well as the unstable and stable fixed points themselves , all of which constitute a set of negligible measure . when damping is introduced via the @xmath13-dependent term in eq . [ eq : dyn ] , it makes the stable fixed points the only attractors for the system . when the driving is turned on , the phase space becomes chaotic with the usual phenomena of intertwining separatrices and resulting homoclinic tangles . the dynamics of individual trajectories in such a system are now very complicated in general and depend sensitively on the choice of parameters and initial conditions . we show snapshots of the development of this kind of chaos in the set of poincaré sections in fig . ( [ figure1]b , c ) together with a period-four orbit represented by the centers of the circles . a broad characterization of the dynamics of the problem as a function of a parameter ( @xmath14 or @xmath15 ) emerges in a bifurcation diagram . this can be constructed in several different and essentially equivalent ways . 
the relatively standard form that we use proceeds as follows : first choose the bifurcation parameter ( let us say @xmath0 ) and correspondingly choose fixed values of @xmath16 , and start with a given value for @xmath17 . now iterate an initial condition , recording the value of the particle 's position @xmath18 at times @xmath19 from its integrated trajectory ( sometimes we record @xmath20 ) . this is done stroboscopically at discrete times @xmath21 where @xmath22 and @xmath23 is an integer @xmath24 with @xmath25 the maximum number of observations made . of these , discard observations at times less than some cut-off time @xmath26 and plot the remaining points against @xmath27 . it must be noted that discarding transient behavior is critical to get results which are independent of initial condition , and we shall emphasize this further below in the context of the net transport or current . if the system has a fixed-point attractor then all of the data lie at one particular location @xmath28 . a periodic orbit with period @xmath29 ( that is , with period commensurate with the driving ) shows up with @xmath30 points occupying only @xmath31 different locations of @xmath32 for @xmath27 . all other orbits , including periodic orbits of incommensurate period , result in a simply-connected or multiply-connected dense set of points . for the next value @xmath33 , the last computed values of @xmath34 at @xmath35 are used as initial conditions , and , as previously , results are stored after the cut-off , and so on until @xmath36 . that is , the bifurcation diagram is generated by sweeping the relevant parameter , in this case @xmath0 , from @xmath27 through some maximum value @xmath37 . this procedure is intended to catch all coexisting attractors of the system within the specified parameter range . note that several initial conditions are effectively used throughout the process , and a bifurcation diagram is not the behavior of a single trajectory . 
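the sweep described above can be sketched in a few lines of python . this is only an illustration under stated assumptions : the equation of motion x'' + b x' + dv/dx = a cos ( omega t ) , the two-harmonic potential with its normalization `DELTA` , the default parameter values , and the fixed-step runge-kutta integrator are all choices made here for the example , not details confirmed by the text .

```python
import math

# Assumed model (hypothetical reading of the ratchet described above):
#   x'' + b x' + dV/dx = a cos(omega t)
#   V(x) = C - [sin(2 pi (x - X0)) + 0.25 sin(4 pi (x - X0))] / (4 pi^2 DELTA)
# X0 is chosen so that a potential minimum sits at x = 0.
X0 = -math.acos((math.sqrt(3.0) - 1.0) / 2.0) / (2.0 * math.pi)
DELTA = math.sin(2 * math.pi * abs(X0)) + 0.25 * math.sin(4 * math.pi * abs(X0))


def ratchet_force(x):
    """Return -dV/dx for the assumed potential."""
    u = 2.0 * math.pi * (x - X0)
    return (2.0 * math.cos(u) + math.cos(2.0 * u)) / (4.0 * math.pi * DELTA)


def accel(x, v, t, a, b, omega):
    return -b * v + ratchet_force(x) + a * math.cos(omega * t)


def rk4_step(x, v, t, dt, a, b, omega):
    """One classical fourth-order Runge-Kutta step for (x, v)."""
    k1x, k1v = v, accel(x, v, t, a, b, omega)
    k2x = v + 0.5 * dt * k1v
    k2v = accel(x + 0.5 * dt * k1x, k2x, t + 0.5 * dt, a, b, omega)
    k3x = v + 0.5 * dt * k2v
    k3v = accel(x + 0.5 * dt * k2x, k3x, t + 0.5 * dt, a, b, omega)
    k4x = v + dt * k3v
    k4v = accel(x + dt * k3x, k4x, t + dt, a, b, omega)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x_new, v_new


def bifurcation_points(a_values, b=0.1, omega=0.67, n_periods=60,
                       n_discard=40, steps_per_period=100, x=0.0, v=0.0):
    """Sweep the drive amplitude and record stroboscopic positions.

    For each amplitude the state is sampled once per driving period, the
    first n_discard samples are dropped as transients, and the final
    state is carried over as the initial condition for the next amplitude.
    """
    period = 2.0 * math.pi / omega
    dt = period / steps_per_period
    points = []
    for a in a_values:
        t = 0.0
        for k in range(n_periods):
            for _ in range(steps_per_period):
                x, v = rk4_step(x, v, t, dt, a, b, omega)
                t += dt
            if k >= n_discard:
                points.append((a, x))
    return points
```

plotting the returned ( a , x ) pairs as a scatter plot gives the diagram ; a fixed-point attractor appears as a single location per amplitude , while chaotic motion fills a band . positions are left unwrapped here , so for running orbits one would typically reduce x modulo the spatial period before plotting .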
we have made several plots , as a test , with different initial conditions and the diagrams obtained are identical . we show several examples of this kind of bifurcation diagram below , where they are being compared with the corresponding behavior of the current . having broadly understood the wide range of behavior for individual trajectories in this system , we now turn in the next section to a discussion of the non  equilibrium properties of a statistical ensemble of these trajectories , specifically the current for an ensemble . the current @xmath38 for an ensemble in the system is defined in an intuitive manner by mateos@xcite as the time  average of the average velocity over an ensemble of initial conditions . that is , an average over several initial conditions is performed at a given observation time @xmath39 to yield the average velocity over the particles @xmath40 this average velocity is then further time  averaged ; given the discrete time @xmath39 for observation this leads to a second sum @xmath41 where @xmath25 is the number of time  observations made . for this to be a relevant quantity to compare with bifurcation diagrams , @xmath38 should be independent of the quantities @xmath42 but still strongly dependent on @xmath43 . a further parameter dependence that is being suppressed in the definition above is the shape and location of the ensemble being used . that is , the transport properties of an ensemble in a chaotic system depend in general on the part of the phase  space being sampled . it is therefore important to consider many different initial conditions to generate a current . the first straightforward result we show in fig . ( [ figure2 ] ) is that in the case of chaotic trajectories , a single trajectory easily displays behavior very different from that of many trajectories . however , it turns out that in the regular regime , it is possible to use a single trajectory to get essentially the same result as obtained from many trajectories . 
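the two-stage average that defines the current can be written out explicitly . the sketch below assumes the velocities have already been recorded at each observation time for each particle ; the function name and the data layout are illustrative choices .

```python
def ensemble_current(velocities):
    """Time-average of the ensemble-averaged velocity.

    velocities[j][i] is the velocity of particle i at observation time j.
    """
    if not velocities or not velocities[0]:
        raise ValueError("need at least one observation of at least one particle")
    # First average over the ensemble at each observation time ...
    mean_velocity = [sum(row) / len(row) for row in velocities]
    # ... then average those ensemble means over the observation times.
    return sum(mean_velocity) / len(mean_velocity)


# Two observation times, two particles (made-up numbers).
example_current = ensemble_current([[1.0, -1.0], [2.0, 0.0]])
```

with a single particle per row this reduces to the single-trajectory time average , which is the quantity the ensemble result is being contrasted with .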
further consider the bifurcation diagram in fig . ( [ figure3 ] ) where we superimpose the different curves resulting from varying the number of points in the initial ensemble . first , the curve is significantly smoother as a function of @xmath0 for larger @xmath44 . even more relevant is the fact that the single-trajectory data ( @xmath45 ) may show current reversals that do not exist in the large @xmath44 data . 
[ figure : current @xmath38 versus the number of trajectories @xmath44 for @xmath7 ; dashed lines correspond to a regular motion with @xmath46 while solid lines correspond to a chaotic motion with @xmath47 . note that a single trajectory is sufficient for a regular motion while the convergence in the chaotic case is only obtained if @xmath44 exceeds a certain threshold , @xmath48 . ]
[ figure : current @xmath38 versus @xmath0 for different numbers of trajectories @xmath44 ; @xmath45 ( circles ) , @xmath49 ( squares ) and @xmath50 ( dashed lines ) . note that a single trajectory suffices in the regular regime where all the curves match . in the chaotic regime , as @xmath44 increases , the curves converge towards the dashed one . ]
also , note that single-trajectory current values are typically significantly greater than ensemble averages . this arises from the fact that an arbitrarily chosen ensemble has particles with idiosyncratic behaviors which often average out . as a result , with these ensembles we see typical @xmath51 , for example , while barbi and salerno report currents about @xmath52 times greater . however , it is not true that only a few trajectories dominate the dynamics completely , else there would not be a saturation of the current as a function of @xmath44 . all this is clear in fig . ( [ figure3 ] ) . we note that the * net * drift of an ensemble can be a lot closer to @xmath53 than the behavior of an individual trajectory . 
it should also be clear that there is a dependence of the current on the location of the initial ensemble , this being particularly true for small @xmath44 , of course . the location is defined by its centroid @xmath54 . for @xmath45 , it is trivially true that the initial location matters to the asymptotic value of the time-averaged velocity , given that this is a non-ergodic and chaotic system . further , considering a gaussian ensemble , say , the width of the ensemble also affects the details of the current , and can show , for instance , illusory current reversal , as seen in figs . ( [ current  bifur1],[current  bifur2 ] ) for example . notice also that in fig . ( [ current  bifur1 ] ) , at @xmath55 and @xmath56 , the deviations between the different ensembles are particularly pronounced . these points are close to bifurcation points where some sort of symmetry breaking is clearly occurring , which underlines our emphasis on the relevance of specifying ensemble characteristics in the neighborhood of unstable behavior . however , why these specific bifurcations should stand out among all the bifurcations in the parameter range shown is not entirely clear . to understand how to incorporate this knowledge into calculations of the current , therefore , consider the fact that if we look at the classical phase space for the hamiltonian or underdamped @xmath57 motion , we see the typical structure of stable islands embedded in a chaotic sea which have quite complicated behavior@xcite . in such a situation , the dynamics always depends on the location of the initial conditions . however , we are not in the hamiltonian situation when the damping is turned on ; in this case , the phase space consists in general of attractors . that is , if transient behavior is discarded , the current is less likely to depend significantly on the location of the initial conditions or on the spread of the initial conditions . 
in particular , in the chaotic regime of a non-hamiltonian system , the initial ensemble needs to be chosen larger than a certain threshold to ensure convergence . however , in the regular regime , it is not important to take a large ensemble and a single trajectory can suffice , as long as we take care to discard the transients . that is to say , in the computation of currents , the definition of the current needs to be modified to : @xmath58 where @xmath59 is some empirically obtained cut-off such that we get a converged current ( for instance , in our calculations , we obtained converged results with @xmath60 ) . when this modified form is used , the convergence ( ensemble-independence ) is more rapid as a function of @xmath61 and the width of the initial conditions . armed with this background , we are now finally in a position to compare bifurcation diagrams with the current , as we do in the next section . our results are presented in the set of figures from fig . ( [ figure5 ] ) to fig . ( [ rev  nobifur ] ) , in each of which we plot both the ensemble current and the bifurcation diagram as a function of the parameter @xmath0 . the main point of these numerical results can be distilled into a series of heuristic statements which we state below ; these are labelled with roman numerals . 
[ figure : for @xmath7 and @xmath8 , we plot current ( upper ) with @xmath62 and bifurcation diagram ( lower ) versus @xmath0 . note that there is a * single * current reversal while there are many bifurcations visible in the same parameter range . ]
consider fig . ( [ figure5 ] ) , which shows the parameter range @xmath63 chosen relatively arbitrarily . in this figure , we see several period-doubling bifurcations leading to order-chaos transitions , such as for example in the approximate ranges @xmath64 . however , there is only one instance of current reversal , at @xmath65 . 
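a minimal sketch of the transient-discarding current follows . it assumes the modified definition simply omits the first `n_cut` observation times from the time average ; the function name , the argument layout , and the example numbers are hypothetical .

```python
def current_without_transients(velocities, n_cut):
    """Ensemble current with the first n_cut observation times discarded.

    velocities[j][i] is the velocity of particle i at observation time j.
    n_cut is an empirically chosen cut-off; it must leave at least one
    observation in the average.
    """
    if n_cut < 0 or n_cut >= len(velocities):
        raise ValueError("n_cut must satisfy 0 <= n_cut < number of observations")
    kept = velocities[n_cut:]
    # Ensemble average at each retained time, then the time average.
    mean_velocity = [sum(row) / len(row) for row in kept]
    return sum(mean_velocity) / len(mean_velocity)


# Made-up data: a large transient at the first observation time.
example_data = [[10.0, 10.0], [1.0, -1.0], [2.0, 0.0]]
with_transient = current_without_transients(example_data, 0)
converged = current_without_transients(example_data, 1)
```

in the toy data the first observation dominates the uncut average , which is the kind of ensemble-dependent artifact the cut-off is meant to remove .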
note , however , that the current is not without structure ; it changes fairly dramatically as a function of parameter . this point is made even more clearly in fig . ( [ figure6 ] ) where the current remains consistently below @xmath53 , and hence there are in fact no current reversals at all . note again , however , that the current has considerable structure , even while remaining negative . 
[ figure : for @xmath66 and @xmath8 , plotted are current ( upper ) and bifurcation diagram ( lower ) versus @xmath0 with @xmath62 . notice the current stays consistently below @xmath53 . ]
[ figure : current and bifurcations versus @xmath0 . in ( a ) and ( b ) we show ensemble dependence , specifically in ( a ) the black curve is for an ensemble of trajectories starting centered at the stable fixed point @xmath67 with a root-mean-square gaussian width of @xmath68 , and the brown curve for trajectories starting from the unstable fixed point @xmath69 and of width @xmath68 . in ( b ) , all ensembles are centered at the stable fixed point , the black line for an ensemble of width @xmath68 , brown a width of @xmath70 and maroon with width @xmath71 . ( c ) is the comparison of the current @xmath38 without transients ( black ) and with transients ( brown ) along with the single-trajectory results in blue ( after barbi and salerno ) . the initial conditions for the ensembles are centered at @xmath67 with a root-mean-square gaussian width of @xmath68 . ( d ) is the corresponding bifurcation diagram . ]
it is possible to find several examples of this at different parameters , leading to the negative conclusion , therefore , that * ( i ) not all bifurcations lead to current reversal * . however , we are searching for positive correlations , and at this point we have not precluded the more restricted statement that all current reversals are associated with bifurcations , which is in fact mateos 's conjecture . 
we therefore now move onto comparing our results against the specific details of barbi and salerno s treatment of this conjecture . in particular , we look at their figs . ( 2,3a,3b ) , where they scan the parameter region @xmath72 . the distinction between their results and ours is that we are using _ ensembles _ of particles , and are investigating the convergence of these results as a function of number of particles @xmath44 , the width of the ensemble in phase  space , as well as transience parameters @xmath73 . our data with larger @xmath44 yields different results in general , as we show in the recomputed versions of these figures , presented here in figs . ( [ current  bifur1],[current  bifur2 ] ) . specifically , ( a ) the single  trajectory results are , not surprisingly , cleaner and can be more easily interpreted as part of transitions in the behavior of the stability properties of the periodic orbits . the ensemble results on the other hand , even when converged , show statistical roughness . ( b ) the ensemble results are consistent with barbi and salerno in general , although disagreeing in several details . for instance , ( c ) the bifurcation at @xmath74 has a much gentler impact on the ensemble current , which has been growing for a while , while the single  trajectory result changes abruptly . note , ( d ) the very interesting fact that the single  trajectory current completely misses the bifurcation  associated spike at @xmath75 . further , ( e ) the barbi and salerno discussion of the behavior of the current in the range @xmath76 is seen to be flawed our results are consistent with theirs , however , the current changes are seen to be consistent with bifurcations despite their statements to the contrary . on the other hand ( f ) , the ensemble current shows a case [ in fig . ( [ current  bifur2 ] ) , at @xmath77 of current reversal that does not seem to be associated with bifurcations . 
in this spike , the current abruptly drops below @xmath53 and then rises above it again . the single trajectory current completely ignores this particular effect , as can be seen . the bifurcation diagram indicates that in this case the important transitions happen either before or after the spike . all of this adds up to two statements : the first is a reiteration of the fact that there is significant information in the ensemble current that can not be obtained from the single  trajectory current . the second is that the heuristic that arises from this is again a negative conclusion , that * ( ii ) not all current reversals are associated with bifurcations . * where does this leave us in the search for ` positive ' results , that is , useful heuristics ? one possible way of retaining the mateos conjecture is to weaken it , i.e. make it into the statement that * ( iii ) _ most _ current reversals are associated with bifurcations . * same as fig . ( [ current  bifur1 ] ) except for the range of @xmath0 considered.,title="fig:",width=302 ] for @xmath78 and @xmath8 , plotted are current ( upper ) and bifurcation diagram ( lower ) versus @xmath0 with @xmath62 . note in particular in this figure that eyeball tests can be misleading . we see reversals without bifurcations in ( a ) whereas the zoomed version ( c ) shows that there are windows of periodic and chaotic regimes . this is further evidence that jumps in the current correspond in general to bifurcation.,title="fig:",width=302 ] for @xmath7 and @xmath79 , current ( upper ) and bifurcation diagram ( lower ) versus @xmath0.,title="fig:",width=302 ] however , a * different * rule of thumb , previously not proposed , emerges from our studies . this generalizes mateos conjecture to say that * ( iv ) bifurcations correspond to sudden current changes ( spikes or jumps)*. note that this means these changes in current are not necessarily reversals of direction . 
if this current jump or spike goes through zero , this coincides with a current reversal , making the mateos conjecture a special case . the physical basis of this argument is the fact that ensembles of particles in chaotic systems _ can _ have net directed transport but the details of this behavior depends relatively sensitively on the system parameters . this parameter dependence is greatly exaggerated at the bifurcation point , when the dynamics of the underlying single  particle system undergoes a transition a period  doubling transition , for example , or one from chaos to regular behavior . scanning the relevant figures , we see that this is a very useful rule of thumb . for example , it completely captures the behaviour of fig . ( [ figure6 ] ) which can not be understood as either an example of the mateos conjecture , or even a failure thereof . as such , this rule significantly enhances our ability to characterize changes in the behavior of the current as a function of parameter . a further example of where this modified conjecture helps us is in looking at a seeming negation of the mateos conjecture , that is , an example where we seem to see current  reversal without bifurcation , visible in fig . ( [ hidden  bifur ] ) . the current  reversals in that scan of parameter space seem to happen inside the chaotic regime and seemingly independent of bifurcation . however , this turns out to be a ` hidden ' bifurcation when we zoom in on the chaotic regime , we see hidden periodic windows . this is therefore consistent with our statement that sudden current changes are associated with bifurcations . each of the transitions from periodic behavior to chaos and back provides opportunities for the current to spike . however , in not all such cases can these hidden bifurcations be found . we can see an example of this in fig . ( [ rev  nobifur ] ) . 
the current is seen to move smoothly across @xmath80 with seemingly no corresponding bifurcations , even when we do a careful zoom on the data , as in fig . ( [ hidden  bifur ] ) . arguably , although this is a subjective judgment , the change occurs close to a bifurcation point . this result , that there are situations where the heuristics simply do not seem to apply , is part of the open questions associated with this problem , of course . we note , however , that we have seen that these broad arguments hold when we vary other parameters as well ( figures not shown here ) . in conclusion , in this paper we have taken the approach that it is useful to find general rules of thumb ( even if not universally valid ) to understand the complicated behavior of non-equilibrium nonlinear statistical mechanical systems . in the case of chaotic deterministic ratchets , we have shown that it is important to factor out issues of size , location , spread , and transience in computing the ` current ' due to an ensemble before we search for such rules , and that the dependence on ensemble characteristics is most critical near certain bifurcation points . we have then argued that the following heuristic characteristics hold : bifurcations in single-trajectory behavior often correspond to sudden spikes or jumps in the current for an ensemble in the same system . current reversals are a special case of this . however , not all spikes or jumps correspond to a bifurcation , nor vice versa . the open question is clearly whether the conditions under which these rules hold or fail can be made more concrete . a.k . gratefully acknowledges t. barsch and kamal p. singh for stimulating discussions , the reimar lüst grant and financial support from the alexander von humboldt foundation in bonn . a.k.p . 
is grateful to carleton college for the ` sit , wallin , and class of 1949 ' sabbatical fellowships , and to the mpipks for hosting him for a sabbatical visit , which led to this collaboration . useful discussions with j . m . rost on preliminary results are also acknowledged . p. hnggi and bartussek , in nonlinear physics of complex systems , lecture notes in physics vol . 476 , edited by j. parisi , s.c . mueller , and w. zimmermann ( springer verlag , berlin , 1996 ) , pp.294  308 ; r.d . asturmian , science * 276 * , 917 ( 1997 ) ; f. jlicher , a. ajdari , and j. prost , rev . mod . phys . * 69 * , 1269 ( 1997 ) ; c. dring , nuovo cimento d*17 * , 685 ( 1995 ) s. flach , o. yevtushenko , and y. zolotaryuk , phys . rev . lett . * 84 * , 2358 ( 2000 ) ; o. yevtushenko , s. flach , y. zolotaryuk , and a. a. ovchinnikov , europhys . lett . * 54 * , 141 ( 2001 ) ; s. denisov et al . e * 66 * , 041104 ( 2002 )"  "in 84 , 258 ( 2000 ) , mateos conjectured that current reversal in a classical deterministic ratchet is associated with bifurcations from chaotic to periodic regimes .
this is based on the comparison of the current and the bifurcation diagram as a function of a given parameter for a periodic asymmetric potential .
barbi and salerno , in 62 , 1988 ( 2000 ) , have further investigated this claim and argue that , contrary to mateos ' claim , current reversals can also occur in the absence of bifurcations .
barbi and salerno 's studies are based on the dynamics of one particle rather than the statistical mechanics of an ensemble of particles moving in the chaotic system .
the behavior of ensembles can be quite different , depending upon their characteristics , which leaves their results open to question . in this paper we present results from studies showing how the current depends on the details of the ensemble
used to generate it , as well as conditions for convergent behavior ( that is , independent of the details of the ensemble ) .
we are then able to present the converged current as a function of parameters , in the same system as mateos as well as barbi and salerno .
we show evidence for current reversal without bifurcation , as well as bifurcation without current reversal .
we conjecture that it is appropriate to correlate abrupt changes in the current with bifurcation , rather than current reversals , and show numerical evidence for our claims ." 
"studies of laser beams propagating through turbulent atmospheres are important for many applications such as remote sensing , tracking , and long-distance optical communications . however , fully coherent laser beams are very sensitive to fluctuations of the atmospheric refractive index . the initially coherent laser beam acquires some properties of gaussian statistics in the course of its propagation through the turbulence . as a result , the noise / signal ratio approaches unity for long-distance propagation . ( see , for example , refs.@xcite@xcite ) . this unfavourable effect limits the performance of communication channels . to mitigate this negative effect the use of partially ( spatially ) coherent beams was proposed . the coherent laser beam can be transformed into a partially coherent beam by means of a phase diffuser placed near the exit aperture . this diffuser introduces an additional phase ( randomly varying in space and time ) to the wave front of the outgoing radiation . statistical characteristics of the random phase determine the initial transverse coherence length of the beam . it is shown in refs . @xcite,@xcite that a considerable decrease in the noise / signal ratio can occur under the following conditions : ( i ) the ratio of the initial transverse coherence length , @xmath0 , to the beam radius , @xmath1 , should be substantially smaller than unity ; and ( ii ) the characteristic time of phase variations , @xmath2 , should be much smaller than the integration time , @xmath3 , of the detector . however , only the limiting cases @xmath4 and @xmath5 have been considered in the literature . ( see , for example , refs . @xcite,@xcite and ref . @xcite , respectively ) . it is evident that the inequality @xmath6 can be easily satisfied by choosing a detector with a very long integration time . at the same time , this kind of detector can not distinguish different signals within the interval @xmath3 . 
this means that the resolution of the receiving system might become too low for the case of large @xmath3 . on the other hand , there is a technical restriction on phase diffusers : up to now their characteristic times , @xmath2 , are not smaller than @xmath7 . besides that , in some specific cases ( see , for example , ref . @xcite ) , the spectral broadening of laser radiation due to the phase diffuser ( @xmath8 ) may become unacceptably high . the factors mentioned above impose serious restrictions on the physical characteristics of phase diffusers which could be potentially useful for suppressing the intensity fluctuations . an adequate choice of diffusers may be facilitated if we know in detail the effect of finite-time phase variation , introduced by them , on the photon statistics . in this case , it is possible to control the performance of communication systems . in what follows , we will obtain theoretically the dependence of the scintillation index on @xmath9 without any restrictions on the value of this ratio . this is the main purpose of our paper . further analysis is based on the formalism developed in ref . @xcite and modified here to treat the case of finite-time dynamics of the phase diffuser . the detectors of the absorbed type do not sense the instantaneous intensity of electromagnetic waves @xmath10 . they sense the intensity averaged over some finite interval @xmath3 , i.e. @xmath11 . usually , the averaging time @xmath3 ( the integration time of the detector ) is much smaller than the characteristic time of the turbulence variation , @xmath12 , ( @xmath13 ) . therefore , the average value of the intensity can be obtained by further averaging of eq . [ one ] over many measurements corresponding to various realizations of the refractive-index configurations . 
the scintillation index determining the mean-square fluctuations of the intensity is defined by @xmath14\bigg/\big<\bar{i}\big>^2 = \frac{\big<:\bar i(t)^2:\big>}{\big<\bar i\big>^2} - 1 ,\ ] ] where the symbol @xmath15 indicates the normal ordering of the creation and annihilation operators which determine the intensity , @xmath10 . ( see more details in refs . @xcite,@xcite ) . the brackets @xmath16 indicate quantum-mechanical and atmospheric averagings . the intensity @xmath17 depends not only on @xmath18 , but also on the spatial variable @xmath19 . therefore , the detected intensity is the intensity @xmath20 averaged not only over @xmath18 as in eq . [ one ] , but also over the detector aperture . for simplicity , we will restrict ourselves to calculations of the intensity correlations for coinciding spatial points that correspond to `` small '' detector aperture . this simplification is quite reasonable for a long-distance propagation path of the beam . in the case of quasimonochromatic light , we can choose @xmath20 in the form @xmath21 where @xmath22 and @xmath23 are the creation and annihilation operators of photons with momentum @xmath24 . they are given in the heisenberg representation . @xmath25 is the volume of the system . it follows from eqs . [ two],[three ] that @xmath26 can be obtained if one knows the average @xmath27 it is a complex problem to obtain this value for arbitrary turbulence strengths and propagation distances . nevertheless , the following qualitative reasoning can help to do this in the case of strong turbulence . we have mentioned that the laser light acquires the properties of gaussian statistics in the course of its propagation through the turbulent atmosphere . as a result , in the limit of infinitely long propagation path , @xmath28 , only `` diagonal '' terms , i.e. terms with ( i ) @xmath29 or ( ii ) @xmath30 , @xmath31 contribute to the right part of eq . [ four ] . 
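The definitions of the detector-averaged intensity and of the scintillation index given above can be illustrated numerically. The sketch below is a classical simplification: it treats the intensity as an ordinary sampled signal and ignores the normal ordering of photon operators used in the text. The function names `detector_average` and `scintillation_index` and the parameter `window` (the integration time expressed in samples) are hypothetical.

```python
import numpy as np

def detector_average(intensity, window):
    """Average a sampled intensity over consecutive, non-overlapping
    windows of `window` samples (a stand-in for the integration time)."""
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    i = np.asarray(intensity, dtype=float)
    n = (i.size // window) * window  # drop an incomplete trailing window
    if n == 0:
        raise ValueError("intensity record is shorter than one window")
    return i[:n].reshape(-1, window).mean(axis=1)

def scintillation_index(mean_intensities):
    """Normalized variance <I^2>/<I>^2 - 1 of detector-averaged intensities."""
    ibar = np.asarray(mean_intensities, dtype=float)
    return (ibar ** 2).mean() / ibar.mean() ** 2 - 1.0

# A signal alternating between 1 and 3 fluctuates strongly for a fast
# detector (window of 1 sample) and not at all once each window spans a
# full period of the fluctuation (window of 2 samples).
signal = np.array([1.0, 3.0, 1.0, 3.0])
fast = scintillation_index(detector_average(signal, 1))
slow = scintillation_index(detector_average(signal, 2))
print(fast, slow)
```

The toy signal mimics the qualitative point of the text: fluctuations that are fast compared with the integration time are averaged out by the detector, while slow ones are not.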
for large but still finite @xmath28 , there exist small ranges of @xmath32 in case ( i ) and @xmath33 , @xmath34 in case ( ii ) contributing into the sum in eq . the presence of the mentioned regions is due to the two possible ways of correlating of four different waves ( see ref . @xcite ) which enter the right hand side of eq . [ four ] . as explained in ref . @xcite , the characteristic sizes of regions ( i ) and ( ii ) depend on the atmospheric broadening of beam radii as @xmath35 , thus decreasing with increasing @xmath28 . in the case of long  distance propagation , @xmath36 is much smaller than the component of photon wave  vectors perpendicular to the @xmath28 axis . the last quantity grows with @xmath28 as @xmath37 . ( see ref . @xcite ) . for this reason , the overlapping of regions ( i ) and ( ii ) can be neglected . in this case eq . [ four ] can be rewritten in the convenient form : @xmath38 @xmath39 where the value @xmath40 , confining summation over @xmath41 , is chosen to be greater than @xmath42 but much smaller than the characteristic transverse wave vector of the photons ; this is consistent with the above explanations . the two terms in the right  hand side correspond to the two regions of four  wave correlations . the quantity @xmath43 entering the right side of eq . [ five ] is the operator of photon density in phase space ( the photon distribution function in @xmath44 space ) . it was used in refs . @xcite,@xcite and @xcite for the description of photon propagation in turbulent atmospheres . by analogy , we can define the two  time distribution function @xmath45 then eq . [ five ] can be rewritten in terms of the distribution functions as @xmath46 let us represent @xmath47 in the form @xmath48 . we assume that @xmath49 , as explained in the text after eq.[one ] . in this case the hamiltonian of photons in a turbulent atmosphere can be considered to be independent of time . as a result , both functions defined by eqs . 
[ six ] and [ seven ] satisfy the same kinetic equation , i.e. @xmath50 @xmath51 where @xmath52 is the photon velocity and @xmath53 is a random force caused by the turbulence . this force is equal to @xmath54 , where @xmath55 is the frequency of laser radiation . @xmath56 is the refractive index of the atmosphere . the general solution of the equation for @xmath48 can be written in the form @xmath57 where @xmath58 @xmath59 the functions @xmath60 and @xmath61 obey the equations of motion @xmath62 with the boundary conditions @xmath63 . the instant @xmath64 is equal to @xmath65 , where @xmath66 is the speed of light . @xmath64 is the time of the exit of photons from the source . this choice of @xmath64 makes it possible to neglect the influence of the turbulence on the initial values of operators @xmath67 ( their dependence on time is as in vacuum ) . the term for @xmath68 can be obtained from eq . [ twelve ] by putting @xmath69 . substituting both distribution functions into eq . [ eight ] , we obtain @xmath70 @xmath71 @xmath72:\big>,\ ] ] where @xmath73 and @xmath74 are solutions of eqs . [ twelve ] with the initial conditions @xmath63 and @xmath75 , respectively . the operators on the right side of eq . [ thirteen ] are related through matching conditions with the amplitudes of the exiting laser radiation ( see ref . @xcite ) by the relation @xmath76 where @xmath77 is the operator of the laser field , which is assumed to be a single-mode field , and the subscript ( @xmath78 ) denotes the component perpendicular to the @xmath28 axis . the function @xmath79 describes the profile of the laser mode , which is assumed to be a gaussian-type function [ @xmath80 ] . @xmath1 describes the initial radius of the beam . to account for the effect of the phase diffuser , a factor @xmath81 or @xmath82 should be inserted into the integrand of eq . [ fourteen ] . the quantity @xmath83 is the random phase introduced by the phase diffuser . 
a similar consideration is applicable to each of the four photon operators entering both terms in square brackets of eq . [ thirteen ] . it can be easily seen that the factor @xmath84},\ ] ] describing the effect of the phase screen on the beam , enters implicitly the integrand of eq . [ thirteen ] ( the indices @xmath78 are omitted here for the sake of brevity ) . there are integrations over variables @xmath85 as shown in eq . [ fourteen ] . furthermore , the brackets @xmath16 , which indicate averaging over different realizations of the atmospheric inhomogeneities , also indicate averaging over different states of the phase diffuser . as long as the two types of averaging are uncorrelated , the factor ( [ fifteen ] ) entering eq . [ thirteen ] must be averaged over different instants , @xmath64 . to begin with , let us consider the simplest case of two phase correlations @xmath86}\big > .\ ] ] it is evident that in the case @xmath87 , as shown schematically in fig . 1 , the factor ( [ sixteen ] ) is sizable only if points @xmath19 and @xmath88 are close to one another . [ figure caption : two curves correspond to different instants @xmath18 and @xmath89 . ] therefore , the term given by eq . [ sixteen ] can be replaced by @xmath90 where @xmath91 is considered to be a gaussian random variable with the mean-square values given by @xmath92^2\rangle = \langle [ \frac{\partial \varphi({\bf r},t_0)}{\partial y} ]^2\rangle = 2\lambda_c^{-2}$ ] , where @xmath93 is the correlation length of phase fluctuations . ( see fig . 1 ) . as we see , in this case the effect of phase fluctuations can be described by the schell model @xcite@xcite,@xcite@xcite . a somewhat more complex situation arises for the average value of @xmath94 given by eq . [ fifteen ] . there is an effective phase correlation not only in the case of coincident times , but also for differing times . for @xmath95 , two different sets of coordinates contribute considerably to phase correlations . 
this can be described mathematically as @xmath96}\big> \approx \big< e^{i[\varphi({\bf r},t_0)-\varphi({\bf r^\prime},t_0)]}\big> \times\ ] ] @xmath97}\big> + \big< e^{i[\varphi({\bf r},t_0)-\varphi({\bf r^\prime_1},t_0+\tau)]}\big> \big< e^{i[\varphi({\bf r_1},t_0+\tau)-\varphi({\bf r^\prime},t_0)]}\big> .\ ] ] repeating the arguments leading to eq . [ seventeen ] , we represent the difference in the last term @xmath98 as @xmath99 then , considering the random functions @xmath100 and @xmath101 as independent gaussian variables , we obtain a simple expression for @xmath102 . it is given by @xmath103}+ e^{-\lambda_c^{-2}[({\bf r - r^\prime_1})^2+({\bf r^\prime - r_1})^2]-2\nu^2\tau^2},\ ] ] where @xmath104^2\rangle = 2\nu^2 $ ] . as we see , the effect of the phase screen can be described by two parameters , @xmath93 and @xmath105 , which characterize the spatial and temporal coherence of the laser beam . in the limiting case , @xmath106 , the second term in eq . [ twenty ] vanishes and the problem is reduced to the case considered in refs . @xcite,@xcite . in the opposite case , @xmath107 , both terms in eq . [ twenty ] are important . this is shown in ref . @xcite . in what follows , we will see that these two limiting cases have physical interpretations where @xmath108 ( slow detector ) and @xmath109 ( fast detector ) , respectively . there is a specific realization of the diffuser in which a random phase distribution moves across the beam . ( this situation can be modeled by a rotating transparent disk with large diameter and varying thickness . ) the phase depends here only on the variable @xmath110 , i.e. @xmath111 where @xmath112 is the velocity of the drift . then we have @xmath113}+e^{-\lambda_c^{-2}[({\bf r - r^\prime_1+v}\tau)^2+({\bf r^\prime - r_1+v}\tau)^2]}.\ ] ] comparing eqs . 
[ twenty ] and [ twtw ] , we see that the quantity , @xmath114 , stands for the characteristic parameter describing the efficiency of the phase diffuser . the criterion of a `` slow '' detector requires @xmath115 . qualitatively , the two scenarios of phase variations , given by eqs . [ twenty ] and [ twtw ] , affect the intensity fluctuations in a similar way . in what follows , we consider the first of them as the simplest one . ( this is because the spatial and temporal variables in @xmath102 , given by eq . [ twenty ] , are separable . ) [ figure caption : _ vs _ propagation distance @xmath28 in the case of `` slow '' detector : @xmath116 . the parameter @xmath117 indicates different initial coherence length . in the absence of phase diffuser @xmath118 ( solid line ) . @xmath119 is the conventional parameter describing the strength of the atmospheric turbulence . ] substituting the expressions for operators given by eq . [ fourteen ] with account for the phase factors @xmath120 and averaging over time as shown in eq . [ one ] , we obtain @xmath121 @xmath122\bigg > , \ ] ] where the notation @xmath16 after sums indicates averaging over different realizations of the atmospheric refractive index . the parameter @xmath123 describes the initial coherence length modified by the phase diffuser . other notations are defined by the following relations @xmath124 @xmath125 @xmath126 further calculations follow the scheme described in ref . @xcite . fig . 2 illustrates the effect of the phase diffuser on scintillations in the limit of a `` slow '' detector ( @xmath127 ) . we can see a considerable decrease in @xmath128 caused by the phase diffuser . at the same time , the effect of the phase screen on @xmath128 becomes weaker for finite values of @xmath129 . moreover , comparing the two upper curves in fig . 3 , we see the opposite effect : slow phase variations ( @xmath130 ) result in increased scintillations . 
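The Gaussian averages that lead to the exponential factors in eqs. [seventeen] and [twenty] can be written out explicitly. The short derivation below is a sketch under stated assumptions: the phase difference is linearized in the small separation and time delay, the phase gradient and the time derivative of the phase are independent zero-mean Gaussian variables, and the mean-square gradient component is taken as $2\lambda_c^{-2}$, as dimensional consistency with a correlation length $\lambda_c$ requires. The symbols $\mathbf a$, $\mathbf g$ and $\dot\varphi$ are introduced here only for illustration.

```latex
\begin{align}
\varphi(\mathbf r,t_0)-\varphi(\mathbf r',t_0)
  &\approx \mathbf a\cdot\mathbf g ,
  \qquad \mathbf a=\mathbf r-\mathbf r' ,\quad
  \mathbf g=\nabla\varphi(\mathbf r,t_0) , \\
\big\langle e^{i\,\mathbf a\cdot\mathbf g}\big\rangle
  &= \exp\!\Big[-\tfrac12\big\langle(\mathbf a\cdot\mathbf g)^2\big\rangle\Big]
   = \exp\!\Big[-\tfrac12\cdot 2\lambda_c^{-2}\,|\mathbf a|^2\Big]
   = e^{-\lambda_c^{-2}(\mathbf r-\mathbf r')^2} , \\
\varphi(\mathbf r_1,t_0+\tau)-\varphi(\mathbf r',t_0)
  &\approx (\mathbf r_1-\mathbf r')\cdot\mathbf g+\tau\,\dot\varphi ,
  \qquad \langle\dot\varphi^{\,2}\rangle = 2\nu^2 , \\
\big\langle e^{i[(\mathbf r_1-\mathbf r')\cdot\mathbf g+\tau\dot\varphi]}\big\rangle
  &= e^{-\lambda_c^{-2}(\mathbf r_1-\mathbf r')^2-\nu^2\tau^2} .
\end{align}
```

The product of two factors of the last kind carries the time dependence $e^{-2\nu^2\tau^2}$, which is the combination $2\nu^2\tau^2$ appearing in the second term of eq. [twenty].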
there is a simple explanation for this increase : the noise generated by the turbulence is complemented by the noise arising from the random phase screen . the integration time of the detector , @xmath3 , is not sufficiently large for averaging phase variations generated by the diffuser . the function , @xmath131 , has a very simple form in the two limits : ( i ) @xmath132 , when @xmath133 ; and ( ii ) @xmath134 , when @xmath109 . then , in case ( i ) and for small values of the initial coherence [ @xmath135 ] , the asymptotic term for the scintillation index ( @xmath136 ) is given by @xmath137 the right-hand side of eq . [ twfo ] differs from the analogous one in ref . @xcite by the value @xmath138 that is much less than unity but , nevertheless , can be comparable to or even greater than @xmath139 . in case ( ii ) , the asymptotic value of @xmath26 is close to unity , coinciding with the results of refs . @xcite and @xcite . this agrees with the well-known tendency of the scintillation index to approach unity for any source distribution , provided the response time of the recording instrument is short compared with the source coherence time . ( see , for example , survey @xcite ) . a similar tendency can be seen in both figs . 3 and 4 : the curves with the smallest @xmath129 , used for numerical calculations ( @xmath130 ) , are close to the curves `` without diffuser '' in spite of the small initial coherence length [ @xmath140 ] . it can also be seen that all curves approach their asymptotic values very slowly . [ figure caption : describing diffuser dynamics . the solid curve is calculated for @xmath118 ( without diffuser ) . other curves are for @xmath141 . ] it follows from our analysis that the scintillation index is very sensitive to the diffuser parameters , @xmath0 and @xmath142 , for long propagation paths . 
on the other hand , the characteristics of the irradiance such as beam radius , @xmath143 , and angle-of-arrival spread , @xmath144 , do not depend on the presence of the phase diffuser for large values of @xmath28 . to see this , the following analysis is useful . the beam radius expressed in terms of the distribution function is given by @xmath145 straightforward calculations using eq . [ ten ] with @xmath69 ( see ref . @xcite ) result in the following explicit form : @xmath146 where @xmath147 and @xmath148 is the inner radius of turbulent eddies , which in our previous calculations was assumed to be equal to @xmath149 m . as we see , the third term does not depend on the diffuser parameters and it dominates when @xmath150 . a similar situation holds for the angle-of-arrival spread , @xmath144 . ( this physical quantity is of great importance for the performance of communication systems based on frequency encoded information @xcite . ) it is defined by the distribution function as @xmath151 simple calculations , which are very similar to those used to obtain @xmath152 , result in @xmath153 ^ 2=\frac 2{r_1 ^ 2q_0 ^ 2}+12tz\frac { 4z^2}{q_0 ^ 4r^2}(r_1^{2}+3tq_0 ^ 2z)^2.\ ] ] for long propagation paths , eq . [ twei ] reduces to @xmath154 , which like @xmath152 does not depend on the diffuser parameters . as we see , for large distances @xmath28 , the quantities @xmath152 and @xmath144 do not depend on @xmath93 and @xmath105 . this contrasts with the case of the scintillation index . such pronounced differences can be explained by differences in the physical nature of these characteristics . it follows from eq . [ two ] that the functional , @xmath26 , is quadratic in the distribution function , @xmath155 . hence , four-wave correlations determine the value of the scintillation index . the main effect of a phase diffuser on @xmath26 is to destroy correlations between waves exited at different times . ( see more explanations in ref . 
this is achieved at sufficiently small parameters @xmath93 and @xmath156 . in contrast , @xmath152 and @xmath144 depend on two-wave correlations , both waves being given at the same instant . therefore , the values of @xmath152 and @xmath144 do not depend on the rate of phase variations [ @xmath105 does not enter the factor ( [ seventeen ] ) describing the effect of phase diffuser ] . moreover , these quantities become independent of @xmath93 at long propagation paths because light scattering on atmospheric inhomogeneities prevails in this case . the plots in figs . 3 and 4 show that the finite-time effect is quite sizable even for very `` slow '' detectors ( @xmath157 ) . our paper makes it possible to estimate the actual utility of phase diffusers in several physical regimes . we have analyzed the effects of a diffuser on scintillations for the case of large-amplitude phase fluctuations . this specific case is very convenient for theoretical analysis because only two parameters are required to describe the effects of the diffuser . phase fluctuations may occur independently in space as well as in time . also , our formalism can be applied to the physical situation in which a spatially random phase distribution drifts across the beam . ( see eq . [ twtw ] . ) our results show the importance of both parameters , @xmath93 and @xmath142 , for the ability of a phase diffuser to suppress scintillations . this work was carried out under the auspices of the national nuclear security administration of the u.s . department of energy at los alamos national laboratory under contract no . de-ac52-06na25396 . we thank onr for supporting this research ."  "the effect of a random phase diffuser on fluctuations of laser light ( scintillations ) is studied .
not only spatial but also temporal phase variations introduced by the phase diffuser are analyzed .
the explicit dependence of the scintillation index on finite  time phase variations is obtained for long propagation paths .
it is shown that for large amplitudes of phase fluctuations , a finite  time effect decreases the ability of phase diffuser to suppress the scintillations ." 
"the so  called `` nucleon spin crisis '' raised by the european muon collaboration ( emc ) measurement in 1988 is one of the most outstanding findings in the field of hadron physics @xcite,@xcite . the renaissance of the physics of high energy deep inelastic scatterings is greatly indebted to this epoch  making finding . probably , one of the most outstanding progresses achieved recently in this field of physics is the discovery and the subsequent research of completely new observables called generalized parton distribution functions ( gpd ) . it has been revealed that the gpds , which can be measured through the so  called deeply  virtual compton scatterings ( dvcs ) or the deeply  virtual meson productions ( dvmp ) , contain surprisingly richer information than the standard parton distribution functions @xcite@xcite . roughly speaking , the gpds are generalization of ordinary parton distributions and the elastic form factors of the nucleon . the gpds in the most general form are functions of three kinematical variables : the average longitudinal momentum fraction @xmath1 of the struck parton in the initial and final states , a skewdness parameter @xmath3 which measures the difference between two momentum fractions , and the four  momentum  transfer square @xmath4 of the initial and final nucleon . in the forward limit @xmath5 , some of the gpds reduce to the usual quark , antiquark and gluon distributions . on the other hand , taking the @xmath0th moment of the gpds with respect to the variable @xmath1 , one obtains the generalizations of the electromagnetic form factors of the nucleon , which are called the generalized form factors of the nucleon . the complex nature of the gpds , i.e. the fact that they are functions of three variable , makes it quite difficult to grasp their full characteristics both experimentally and theoretically . from the theoretical viewpoint , it may be practical to begin studies with the two limiting cases . 
the first is the forward limit of zero momentum transfer . we have mentioned that , in this limit , some of the gpds reduce to the ordinary parton distribution functions depending on one variable @xmath1 . however , it turns out that , even in this limit , there appear some completely new distribution functions , which can not be accessed by the ordinary inclusive deep-inelastic scattering measurements . very interestingly , it was shown by ji that one of these distributions contains valuable information on the total angular momentum carried by the quark fields in the nucleon @xcite@xcite . this information , combined with the existing knowledge of the longitudinal quark polarization , makes it possible to determine the quark orbital angular momentum contribution to the total nucleon spin purely experimentally . the second class of relatively easy-to-handle quantities is the generalized form factors of the nucleon @xcite,@xcite , which are given as the non-forward nucleon matrix elements of the spin-@xmath0 , twist-two quark and gluon operators . since these latter quantities are given as the nucleon matrix elements of local operators , they can be objects of lattice qcd simulations . ( this should be contrasted with parton distributions . the direct calculation of parton distributions is beyond the scope of lattice qcd simulations , since it requires the nucleon matrix elements of quark bilinears , which are _ nonlocal in time _ . ) in fact , two groups , the lhpc collaboration and the qcdsf collaboration , independently investigated the generalized form factors of the nucleon , and gave several interesting predictions , which can in principle be tested by the measurement of gpds in the near future @xcite@xcite . although these predictions are interesting , there is no _ a priori _ reason to believe that they are realistic enough . 
the reason is mainly that the above-mentioned lattice simulations were carried out in the heavy pion regime around @xmath6 with neglect of the so-called disconnected diagrams . our real world is rather close to the chiral limit with vanishing pion mass , and we know that , in this limit , the goldstone pion plays very important roles in some intrinsic properties of the nucleon . a lattice simulation carried out in the heavy pion region is in danger of missing some important role of chiral dynamics . on the other hand , the chiral quark soliton model ( cqsm ) is an effective model of baryons , which maximally incorporates the chiral symmetry of qcd and its spontaneous breakdown @xcite,@xcite . ( see @xcite@xcite for early reviews . ) it was already applied to the physics of ordinary parton distribution functions with remarkable success @xcite@xcite . for instance , the pion-like quark-antiquark correlation was shown to be essential to understand the famous nmc measurement , which revealed the dominance of the @xmath7-quark over the @xmath8-quark inside the proton @xcite,@xcite,@xcite . it would then be interesting to see what predictions the cqsm gives for the quantities mentioned above . the main purpose of the present study is to investigate the generalized form factors of the nucleon within the framework of the cqsm and to compare its predictions with those of the lattice qcd simulations . of particular interest here is how the final theoretical predictions change with the variation of the pion mass . such an analysis is expected to give some hints for judging the reliability of the lattice qcd predictions at the present level for the observables in question . the plan of the paper is as follows . in sect.ii , we shall briefly explain how to introduce the nonzero pion mass into the scheme of the cqsm with pauli-villars regularization . 
in sect.iii , we derive the theoretical expressions for the generalized form factors of the nucleon . sect.iv is devoted to the discussion of the results of the numerical calculations . some concluding remarks are then given in sect.v . we start with the basic effective lagrangian of the chiral quark soliton model in the chiral limit given as @xmath9 with @xmath10 which describes the effective quark fields , with a dynamically generated mass @xmath11 , strongly interacting with pions @xcite,@xcite . since one of the main purposes of the present study is to see how the relevant observables depend on the pion mass , we add to @xmath12 an explicit chiral-symmetry-breaking term @xmath13 given by @xcite @xmath14 . \label{eq : lsb}\ ] ] here the trace in ( [ eq : lsb ] ) is to be taken with respect to flavor indices . the total model lagrangian is therefore given by @xmath15 naturally , one could have taken an alternative choice in which the explicit chiral-symmetry-breaking effect is introduced in the form of a current quark mass term as @xmath16 . we did not do so , because we do not know any consistent regularization of such an effective lagrangian with finite current quark mass within the framework of the pauli-villars subtraction scheme , as explained in the appendix of @xcite . the effective action corresponding to the above lagrangian is given as @xmath17 = s_f [ u ] + s_m [ u ] , \label{eq : energsol}\ ] ] with @xmath18 = - \,i \,n_c \,\mbox{sp } \ , \ln ( i \not\!\partial - m u^{\gamma_5 } ) , \ ] ] and @xmath19 = \int \frac{1}{4 } \,f_{\pi}^2 \,m_{\pi}^2 \ , \mbox{tr}_f \ , [ u ( x ) + u^{\dagger } ( x ) - 2 ] \,d^4 x .\ ] ] here @xmath20 with @xmath21 and @xmath22 representing the trace of the dirac gamma matrices and the flavors ( isospins ) , respectively . the fermion ( quark ) part of the above action contains ultraviolet divergences . to remove these divergences , we must introduce physical cutoffs . 
for the purpose of regularization , here we use the pauli-villars subtraction scheme . as explained in @xcite , we must eliminate not only the logarithmic divergence contained in @xmath23 $ ] but also the quadratic and logarithmic divergences contained in the equation of motion shown below . to get rid of all these troublesome divergences , we need at least two subtraction terms . the regularized action is thus defined as @xmath24 = s_f^{reg } [ u ] + s_m [ u ] , \ ] ] where @xmath25 = s_f [ u ] - \sum_{i = 1}^2 \,c_i \,s_f^{\lambda _ i } [ u ] .\ ] ] here @xmath26 is obtained from @xmath27 $ ] with @xmath11 replaced by the pauli-villars regulator mass @xmath28 . these parameters are fixed as follows . first , the quadratic and logarithmic divergences contained in the equation of motion ( or in the expression of the vacuum quark condensate ) can , respectively , be removed if the subtraction constants satisfy the following two conditions : @xmath29 ( we recall that the condition which removes the logarithmic divergence in @xmath23 $ ] just coincides with the 1st of the above conditions . ) by solving the above equations for @xmath30 and @xmath31 , we obtain @xmath32 which constrains the values of @xmath30 and @xmath31 , once @xmath33 and @xmath34 are given . for determining @xmath33 and @xmath34 , we use two conditions @xmath35 and @xmath36 which amount to reproducing the correct normalization of the pion kinetic term in the effective meson lagrangian and also the empirical value of the vacuum quark condensate . to derive the soliton equation of motion , we must first write down a regularized expression for the static soliton energy . 
under the hedgehog ansatz @xmath37 for the background pion fields , it is obtained in the form : @xmath38 = e_f^{reg } [ f ( r ) ] + e_m [ f ( r ) ] , \ ] ] where the meson ( pion ) part is given by @xmath39 = - \,f_{\pi}^2 \,m_{\pi}^2 \int d^3 x \,[\cos f ( r ) - 1 ] , \ ] ] while the fermion ( quark ) part is given as @xmath40 = e_{val } + e_{v p}^{reg } , \label{eq : estatic}\ ] ] with @xmath41 here @xmath42 are the quark single-particle energies , given as the eigenvalues of the static dirac hamiltonian in the background pion fields : @xmath43 with @xmath44 , \label{eq : dirach}\ ] ] while @xmath45 denote energy eigenvalues of the vacuum hamiltonian given by ( [ eq : dirach ] ) with @xmath46 ( or @xmath47 ) . the particular state @xmath48 , which is a discrete bound-state orbital coming from the upper dirac continuum under the influence of the hedgehog mean field , is called the valence level . the symbol @xmath49 in ( [ eq : regenerg ] ) denotes the summation over all the negative energy eigenstates of @xmath50 , i.e. the negative energy dirac continuum . the soliton equation of motion is obtained from the stationary condition of @xmath51 $ ] with respect to the variation of the profile function @xmath52 : @xmath53 \nonumber \\ & = & 4 \pi r^2 \,\left\ { - \,m \,[s ( r ) \sin f ( r ) - p ( r ) \cos f ( r ) ] + f_{\pi}^2 m_{\pi}^2 \sin f ( r ) \right\ } , \end{aligned}\ ] ] which gives @xmath54 here @xmath55 and @xmath56 are regularized scalar and pseudoscalar quark densities given as @xmath57 with @xmath58 while @xmath59 and @xmath60 are the corresponding densities evaluated with the regulator mass @xmath28 instead of the dynamical quark mass @xmath11 . we also note that @xmath61 and @xmath62 . as usual , a self-consistent soliton solution is obtained as follows with use of kahana and ripka 's discretized momentum basis @xcite,@xcite . 
first , assuming an appropriate ( though arbitrary ) soliton profile @xmath52 , the eigenvalue problem of the dirac hamiltonian is solved . using the resultant eigenfunctions and their associated eigenenergies , one can calculate the regularized scalar and pseudoscalar densities @xmath55 and @xmath56 . with use of these @xmath63 and @xmath64 , eq.([eq : profile ] ) can then be used to obtain a new soliton profile @xmath52 . the whole procedure above is repeated with this new profile @xmath52 until self-consistency is attained . since the generalized form factors of the nucleon are given as moments of generalized parton distributions ( gpds ) , it is convenient to start with the theoretical expressions of the unpolarized gpds @xmath65 and @xmath66 within the cqsm . following the notation in @xcite,@xcite , we introduce the quantities @xmath67 and @xmath68 here , the isoscalar and isovector combinations respectively correspond to the sum and the difference of the quark flavors @xmath69 and @xmath70 . the relations between these quantities and the generalized parton distribution functions @xmath71 and @xmath72 are obtained most conveniently in the so-called breit frame . they are given by @xmath73 where @xmath74 these two independent combinations of @xmath65 and @xmath66 can be extracted through the spin projection of @xmath75 as @xmath76 where `` tr '' denotes the trace over spin indices , while @xmath77 . now , within the cqsm , it is possible to evaluate the right-hand side ( rhs ) of ( [ eq : hetrace ] ) and ( [ eq : emtrace ] ) . since the answers are already given in several previous papers @xcite@xcite , we do not repeat the derivation . here we comment only on the following general structure of the theoretical expressions for relevant observables in the cqsm . the leading contribution just corresponds to the mean field prediction , which is independent of the collective rotational velocity @xmath78 of the hedgehog soliton . 
the next  to  leading order term takes account of the linear response of the internal quark motion to the rotational motion as an external perturbation , and consequently it is proportional to @xmath78 . it is known that the leading  order term contributes to the isoscalar combination of @xmath79 and to the isovector combination of @xmath80 , while the isoscalar part of @xmath79 and the isovector part of @xmath80 survived only at the next  to  leading order of @xmath78 ( or of @xmath81 ) . the leading  order gpds are then given as @xmath82 here the symbol @xmath83 denotes the summation over the occupied ( the valence plus negative  energy dirac sea ) quark orbitals in the hedgehog mean field . on the other hand , the theoretical expressions for the isovector part of @xmath84 and the isoscalar part of @xmath85 , which survive at the next  to  leading order , are a little more complicated . they are given as double sums over the single quark orbitals as @xmath86 \ , e^{i x m_n z^0 } \nonumber \\ & \ , & \hspace{30 mm } \times \ \langle n  \tau^a  m \rangle \langle m  \,\tau^a \,(1 + \gamma^0 \gamma^3 ) \ , e^{i ( z^0/2 ) \hat{p}_3 } \ , e^{i { \mbox{\boldmath$\delta$ } } \cdot { \mbox{\boldmath$x$ } } } \,e^{i ( z^0/2 ) \hat{p}_3 }  n \rangle . \label{eq : gpdheiv}\end{aligned}\ ] ] and @xmath87 e^{i x m_n z^0 } \nonumber \\ & \ , & \hspace{25 mm } \times \ \langle n  \tau^b  m \rangle \langle m  \,(1 + \gamma^0 \gamma^3 ) \,e^{i ( z^0/2 ) \hat{p}_3 } \ , \frac{\varepsilon^{3 a b } \delta^a}{{\mbox{\boldmath$\delta$}}_{\perp}^2 } \ , e^{i { \mbox{\boldmath$\delta$ } } \cdot { \mbox{\boldmath$x$ } } } \,e^{i ( z^0/2 ) \hat{p}_3 }  n \rangle . \label{eq : gpdemiv}\end{aligned}\ ] ] these four expressions for the unpolarized gpds , i.e. ( [ eq : gpdheis ] ) @xmath88 ( [ eq : gpdemiv ] ) , are the basic starting equations for our present study of the generalized form factors of the nucleon within the cqsm . 
there is an infinite tower of generalized form factors , which are defined as the @xmath0th moments of gpds . in the present study , we confine ourselves to the 1st and the 2nd moments , which respectively correspond to the standard electromagnetic form factors of the nucleon and the so-called gravitational form factors . we are especially interested in the latter , since they are believed to contain valuable information on the spin contents of the nucleon through ji 's angular momentum sum rule @xcite,@xcite . for each isospin channel , the 1st and the 2nd moments of @xmath89 define the sachs-electric and gravito-electric form factors as @xmath90 and @xmath91 on the other hand , the 1st and the 2nd moments of @xmath92 respectively define the sachs-magnetic and gravito-magnetic form factors as @xmath93 and @xmath94 in the following , we shall explain how we can calculate the generalized form factors based on the theoretical expressions of the corresponding gpds , by taking @xmath95 and @xmath96 as examples . setting @xmath97 and integrating over @xmath98 in ( [ eq : gpdheis ] ) , we obtain @xmath99 putting this expression into ( [ eq : ge10def ] ) , we have @xmath100 it is easy to see that , using the generalized spherical symmetry of the hedgehog configuration , the term containing the factor @xmath101 identically vanishes , so that @xmath102 is reduced to a simple form as follows : @xmath103 aside from the factor @xmath104 , this is nothing but the known expression for the isoscalar sachs-electric form factor of the nucleon in the cqsm @xcite . a less trivial example is @xmath96 , which is defined as the 2nd moment of @xmath105 . 
inserting ( [ eq : heis ] ) into ( [ eq : ge10def ] ) and carrying out the integration over @xmath1 , we obtain @xmath106 using the partial  wave expansion of @xmath107 , this can be written as @xmath108 this can further be divided into four pieces as @xmath109 where @xmath110 with @xmath111 to proceed further , we first notice that , by using the generalized spherical symmetry , @xmath112 survives only when @xmath113 , i.e. @xmath114 which leads to the result : @xmath115 to evaluate @xmath116 , we first note that @xmath117^{(\lambda ) } .\ ] ] here , the generalized spherical symmetry dictates that @xmath118 must be zero , so that the rhs of the above equation is effectively reduced to @xmath119^{(0 ) } .\ ] ] this then gives @xmath120^{(0 ) } \ , \phi_n ( { \mbox{\boldmath$x$ } } ) .\end{aligned}\ ] ] owing to the identity @xmath121 we therefore find that @xmath122 next we investigate the third term @xmath123 . using @xmath124^{(\lambda ) } \ \sim \  \,\frac{1}{\sqrt{3 } } \,\delta_{l,1 } \,\delta_{m,0 } \ , [ y_1 ( \hat{x } ) \times \hat{{\mbox{\boldmath$p$ } } } ] ^{(0 ) } , \end{aligned}\ ] ] we obtain @xmath125^{(0 ) } \ , \phi_n ( { \mbox{\boldmath$x$ } } ) .\end{aligned}\ ] ] this term vanishes by the same reason as @xmath116 does . the last term @xmath126 is a little more complicated . we first notice that @xmath127^{(\lambda)}_0 \nonumber \\ & \sim & \sum_{\lambda } \ , \langle 1 0 1 0  \lambda 0 \rangle \ , \langle l m \lambda 0  00 \rangle \ , [ y_l ( \hat{x } ) \times [ { \mbox{\boldmath$\alpha$ } } \times \hat{{\mbox{\boldmath$p$ } } } ] ^{(\lambda ) } ] ^{(0 ) } \nonumber \\ & = & \delta_{m , 0 } \,\langle 1 0 1 0  l 0 \rangle \ , \langle l 0 l 0  0 0 \rangle \ , [ y_l ( \hat{x } ) \times [ { \mbox{\boldmath$ \alpha$ } } \times \hat{{\mbox{\boldmath$p$ } } } ] ^{(l ) } ] ^{(0 ) } , \end{aligned}\ ] ] which dictates that @xmath128 must be 0 or 2 . 
inserting the above expression into ( [ eq : m4 ] ) , and using the explicit values of clebsch  gordan coefficients , @xmath126 becomes @xmath129^{(0 ) } \ , \phi_n ( { \mbox{\boldmath$x$ } } ) \nonumber \\ & + & \frac{\sqrt{4 \pi}}{\sqrt{6 } } \cdot \frac{n_c}{m_n } \int d^3 x \,\sum_{n \leq 0 } \,\phi_n^{\dagger } ( { \mbox{\boldmath$x$ } } ) \ , j_2 ( \delta_{\perp } x ) \ , [ y_2 ( \hat{x } ) \times [ { \mbox{\boldmath$\alpha$ } } \times \hat{{\mbox{\boldmath$p$ } } } ] ^{(2 ) } ] ^{(0 ) } \ , \phi_n ( { \mbox{\boldmath$x$ } } ) .\end{aligned}\ ] ] using the identities @xmath130^{(0 ) } & = & \frac{1}{\sqrt{3 } } { \mbox{\boldmath$\alpha$ } } \cdot \hat{{\mbox{\boldmath$p$ } } } , \\ \ , [ y_2 ( \hat{x } ) \times [ { \mbox{\boldmath$\alpha$ } } \times { \mbox{\boldmath$p$ } } ] ^{(2 ) } ] ^{(0 ) } & = & [ [ y_2 ( \hat{x } ) \times \hat{{\mbox{\boldmath$p$ } } } ] ^{(1 ) } \times { \mbox{\boldmath$\alpha$ } } ] ^{(0 ) } , \end{aligned}\ ] ] @xmath126 can also be written as @xmath131^{(1 ) } \times { \mbox{\boldmath$\alpha$ } } ] ^{(0 ) } \ , \phi_n ( { \mbox{\boldmath$x$ } } ) .\end{aligned}\ ] ] collecting the answers for @xmath132 and @xmath126 , we finally obtain @xmath133^{(1 ) } \times { \mbox{\boldmath$\alpha$ } } ] ^{(0 ) } \,\phi_n ( { \mbox{\boldmath$x$ } } ) .\end{aligned}\ ] ] up to now , we have obtained the theoretical expressions for the isoscalar combination of the generalized form factors @xmath95 and @xmath96 . for notational convenience , we summarize these results in a little more compact forms as follows : @xmath134 and @xmath135^{(1 ) } \times { \mbox{\boldmath$\alpha$ } } ] ^{(0 ) } \ ,  n \rangle \right\ } . \label{eq : inthe20is}\end{aligned}\ ] ] as pointed out before , @xmath95 is @xmath104 times the isoscalar combination of standard sachs  electric form factor of the nucleon . 
analogously , we may call @xmath96 the gravitoelectric form factor of the nucleon ( its quark part ) , since it is related to the nonforward nucleon matrix elements of the quark part of the qcd energy-momentum tensor . the other generalized form factors can be obtained in a similar way . the isovector parts of the generalized electric form factors survive only at the next-to-leading order of @xmath78 . they are given as @xmath136 and @xmath137^{(1 ) } \times { \mbox{\boldmath$\alpha$ } } ] ^{(0 ) } \,{\mbox{\boldmath$\tau$ } } \, | n \rangle \right\ } .\end{aligned}\ ] ] the isoscalar combinations of the generalized magnetic form factors also survive only at the next-to-leading order of @xmath78 , so that they are given as double sums over the single-quark orbitals in the hedgehog mean field as @xmath138 and @xmath139 we recall that @xmath140 just coincides with the known expression of the isoscalar sachs-magnetic form factor of the nucleon in the cqsm @xcite . on the other hand , @xmath141 is sometimes called the gravitomagnetic form factor of the nucleon ( its isoscalar part ) , which we can evaluate within the cqsm based on the above theoretical expression . finally , the leading-order contributions to the isovector part of the generalized magnetic form factors are given as @xmath142 and @xmath143 especially interesting to us are the values of the generalized form factors in the forward limit @xmath5 . the consideration of this limit is also useful for verifying the consistency of our theoretical analyses , since it leads to the fundamental sum rules discussed below . we first consider the forward limit of @xmath95 . from ( [ eq : inthe10is ] ) , we find that @xmath144 subtracting the corresponding vacuum contribution , this reduces to @xmath145 . if we remember the relation @xmath146 the forward limit of ( [ eq : inthe10is ] ) just leads to the sum rule : @xmath147 which means that the sum of the @xmath69-quark and @xmath70-quark numbers in the proton is three . 
next we turn to the forward limit of @xmath96 , which gives @xmath148 it is easy to see that , after regularization and vacuum subtraction , the first term of the rhs of the above equation reduces to the fermion ( quark ) part of the soliton energy , i.e. @xmath149 in ( [ eq : estatic ] ) . it was proved in @xcite that , in the cqsm with vanishing pion mass , the following identity holds : @xmath150 in the case of finite pion mass , which we are handling , this identity does not hold . instead , we can prove ( see appendix ) that @xmath151 that is , the second term in the parenthesis of the rhs of eq.([eq : momsum ] ) just coincides with the pion part of the soliton energy ( or mass ) . since the sum of the quark and pion parts gives the total soliton mass @xmath152 , we then find that @xmath153 in consideration of eq.([eq : inthe20is ] ) , this relation can also be expressed as @xmath154 which means that the total momentum fraction carried by quark fields ( the @xmath69- and @xmath70-quarks ) is just unity . this is an expected result , since the cqsm contains quark fields only ( note that the pion is not a field independent of the quarks ) , so that the total nucleon momentum should be saturated by the quark fields alone . taking the forward limit of @xmath155 , we are again led to a trivial sum rule , constrained by a conservation law . in fact , we have @xmath156 thereby leading to @xmath157 which means that the difference of the @xmath69-quark and the @xmath70-quark numbers in the proton is just unity . on the other hand , the forward limit of @xmath158 leads to the first nontrivial sum rule as @xmath159 since this quantity , which represents the difference of momentum fraction carried by the @xmath69-quark and the @xmath70-quark in the proton , is not constrained by any conservation law , its actual value can be estimated only numerically . next we turn to the discussion of the forward limit of the generalized magnetic form factors . 
first , the forward limit of @xmath140 gives @xmath160 which reproduces the known expression of the isoscalar magnetic moment of the nucleon in the cqsm @xcite . on the other hand , the forward limit of @xmath141 gives @xmath161 it was shown in @xcite that the rhs of the above equation is just unity , i.e. @xmath162 in consideration of ( [ eq : gm2nd ] ) , this identity can be recast into a little different form as @xmath163 \,d x .\end{aligned}\ ] ] assuming the familiar angular momentum sum rule due to ji @xmath164 \,d x = j^{u + d } , \ ] ] the above identity claims that @xmath165 which means that the nucleon spin is saturated by the quark fields alone . this is again a reasonable result , because the cqsm is an effective quark model which contains no explicit gluon fields . the derived identity ( [ eq : gm20ist0 ] ) has still another interpretation . remembering the fact that @xmath141 consists of two parts as @xmath166 eq.([eq : gm20ist0 ] ) dictates that @xmath167 since it also holds that ( the momentum sum rule ) @xmath168 it immediately follows that @xmath169 which is interpreted as showing the absence of the _ net quark contribution to the anomalous gravitomagnetic moment of the nucleon_. finally , we investigate the forward limit of the isovector combination of the generalized magnetic form factors . from eq . ( [ eq : mgivform ] ) , we get @xmath170 which reproduces the known expression of the isovector magnetic form factor of the nucleon in the cqsm . 
on the other hand , letting @xmath5 in ( [ eq : mg2ivform ] ) , we have @xmath171 as shown in @xcite , this sum rule can be recast into the form : @xmath172 where @xmath173 consists of two parts as @xmath174 here , the first part is given as a proton matrix element of the _ free field expression _ for the isovector total angular momentum operator of quark fields as @xmath175 with @xmath176 \ , \psi ( { \mbox{\boldmath$x$ } } ) \,d^3 x \nonumber \\ & = & \hat{l}_f^{(i = 1 ) } + \frac{1}{2 } \,\hat{\sigma}^{(i = 1 ) } .\end{aligned}\ ] ] on the other hand , the second term is given as @xmath177  n \rangle .\ ] ] the model in the chiral limit contains two parameters , the weak pion decay constant @xmath178 and the dynamical quark mass @xmath11 . as usual , @xmath178 is fixed to its physical value , i.e. @xmath179 . for the mass parameter @xmath11 , there is some argument based on the instanton liquid picture of the qcd vacuum that it is not extremely far from @xmath180 @xcite . the previous phenomenological analysis of various static baryon observables based on this model prefer a slightly larger value of @xmath11 between @xmath180 and @xmath181 @xcite@xcite . in the present analysis , we use the value @xmath182 . with this value of @xmath182 , we prepare self  consistent soliton solutions for seven values of @xmath183 , i.e. @xmath184 and @xmath185 , in order to see the pion mass dependence of the generalized form factors etc . favorable physical predictions of the model will be obtained by using the value of @xmath182 and @xmath186 , since this set gives a self  consistent soliton solution close to the phenomenologically successful one obtained with @xmath187 and @xmath188 in the single  subtraction pauli  villars regularization scheme @xcite@xcite . we first show in fig.[fig : profile ] the soliton profile functions @xmath52 obtained with several values of @xmath183 , i.e. @xmath189 , and @xmath185 . 
one sees that the spatial size of the soliton profile becomes more and more compact as the pion mass increases . [ figure [fig : profile ] : the soliton profile functions @xmath52 for several values of @xmath183 . ] we are now ready to show the theoretical predictions of the cqsm for the generalized form factors . since the corresponding lattice predictions are given for the generalized form factors @xmath191 and @xmath192 , which are generalizations of the standard dirac and pauli form factors , we first write down the relations between these form factors and the generalized sachs-type form factors , which we have calculated in the cqsm . they are given by @xmath193 \,/\ , ( 1 + \tau ) , \\ a_{20}^{u + d } ( t ) & = & \left[\ , g_{e , 20}^{(i = 0 ) } ( t ) + \tau \ , g_{m , 20}^{(i = 0 ) } ( t ) \right ] \,/\ , ( 1 + \tau ) , \\ a_{10}^{u - d } ( t ) & = & \left[\ , g_{e , 10}^{(i = 1 ) } ( t ) + \tau \ , g_{m , 10}^{(i = 1 ) } ( t ) \right ] \,/\ , ( 1 + \tau ) , \\ a_{20}^{u - d } ( t ) & = & \left[\ , g_{e , 20}^{(i = 1 ) } ( t ) + \tau \ , g_{m , 20}^{(i = 1 ) } ( t ) \right ] \,/\ , ( 1 + \tau ) , \end{aligned}\ ] ] and @xmath194 \,/\ , ( 1 + \tau ) , \\ b_{20}^{u + d } ( t ) & = & \left[\ , g_{m , 20}^{(i = 0 ) } ( t ) - g_{e , 20}^{(i = 0 ) } ( t ) \right ] \,/\ , ( 1 + \tau ) , \\ b_{10}^{u - d } ( t ) & = & \left[\ , g_{m , 10}^{(i = 1 ) } ( t ) - g_{e , 10}^{(i = 1 ) } ( t ) \right ] \,/\ , ( 1 + \tau ) , \\ b_{20}^{u - d } ( t ) & = & \left[\ , g_{m , 20}^{(i = 1 ) } ( t ) - g_{e , 20}^{(i = 1 ) } ( t ) \right ] \,/\ , ( 1 + \tau ) , \end{aligned}\ ] ] where @xmath195 . 
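the relations above all share one pattern : for each moment and isospin channel , a = ( g_e + tau g_m ) / ( 1 + tau ) and b = ( g_m - g_e ) / ( 1 + tau ) . the following is a minimal sketch of that conversion and its inverse , not the code used in the paper . it assumes the usual convention tau = - t / ( 4 m_n^2 ) for the kinematic factor ; the function names , the nucleon mass value and the sample inputs are illustrative placeholders , not model results .

```python
# Nucleon mass in GeV. Assumed illustrative value, not taken from the text.
M_N = 0.939


def tau(t, m_n=M_N):
    """Kinematic factor, ASSUMING the convention tau = -t / (4 m_n^2).

    t is the (spacelike, t <= 0) momentum transfer squared in GeV^2.
    """
    return -t / (4.0 * m_n ** 2)


def sachs_to_dirac_pauli(g_e, g_m, t, m_n=M_N):
    """Map Sachs-type (G_E, G_M) to Dirac/Pauli-type (A, B) form factors.

    The same map applies to each moment (10, 20) and isospin channel.
    """
    x = tau(t, m_n)
    a = (g_e + x * g_m) / (1.0 + x)
    b = (g_m - g_e) / (1.0 + x)
    return a, b


def dirac_pauli_to_sachs(a, b, t, m_n=M_N):
    """Inverse map: G_E = A - tau * B and G_M = A + B."""
    x = tau(t, m_n)
    return a - x * b, a + b


if __name__ == "__main__":
    # Placeholder inputs, chosen only to exercise the functions.
    a, b = sachs_to_dirac_pauli(g_e=0.4, g_m=1.1, t=-1.0)
    print(a, b)
    print(dirac_pauli_to_sachs(a, b, t=-1.0))
```

at t = 0 the factor tau vanishes , so the map reduces to a = g_e and b = g_m - g_e , which is the simplification used for the zero-momentum-transfer comparison with the lattice results .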
we recall that @xmath196 and @xmath197 are nothing but the standard dirac and pauli form factors of the nucleon : @xmath198 since the lattice simulations by the lhpc and qcdsf collaborations were carried out in the heavy pion region around @xmath199 and since simulations in the small pion mass region are hard to perform , we think it interesting to investigate the pion mass dependence of the generalized form factors within the framework of the cqsm . for simplicity , we shall show the pion mass dependence of the generalized form factors at zero momentum transfer only . we think this is enough for our purpose because the generalized form factors at zero momentum transfer contain the most important information for clarifying the underlying spin structure of the nucleon . at zero momentum transfer , the relations between the generalized dirac and pauli form factors and the generalized sachs-type form factors are simplified to become @xmath200 and @xmath201 [ figure [fig : a1020is ] : the predictions of the cqsm for @xmath203 and @xmath202 as functions of @xmath183 ( the filled diamonds ) , together with the corresponding lattice predictions . the open triangles correspond to the predictions of the lhpc group @xcite , while the open squares correspond to those of the qcdsf collaboration @xcite . ] fig.[fig : a1020is ] shows the predictions of the cqsm for @xmath203 and @xmath202 as functions of @xmath183 , together with the corresponding lattice predictions . as for @xmath203 , the cqsm predictions and the lattice qcd predictions are both independent of @xmath183 and consistent with the constraint of the quark number sum rule : @xmath204 with high numerical precision . turning to @xmath202 , one finds a sizable difference between the predictions of the cqsm and of the lattice qcd . the lattice qcd predicts that @xmath205 which means that only about @xmath206 of the total nucleon momentum is carried by the quark fields , while the rest is borne by the gluon fields . 
on the other hand , the cqsm prediction for the same quantity is @xmath207 which means that the quark fields saturate the total nucleon momentum . this may certainly be a limitation of an effective quark model , which contains no explicit gluon fields . note , however , that the total quark momentum fraction @xmath202 is a scale-dependent quantity . the lattice result corresponds to the energy scale of @xmath208 @xcite , while the cqsm prediction should be taken as that of the model energy scale around @xmath209 @xcite . we shall later make a more meaningful comparison by taking care of the scale dependencies of relevant observables . [ figure [fig : a1020iv ] : the predictions of the cqsm for @xmath211 and @xmath210 as functions of @xmath183 , together with the corresponding lattice predictions @xcite,@xcite . the meaning of the symbols is the same as in fig.[fig : a1020is ] . ] next , in fig.[fig : a1020iv ] , we show the isovector combination of the generalized form factors @xmath211 and @xmath210 . as for @xmath211 , both the cqsm and the lattice simulation reproduce the quark number sum rule @xmath212 with good precision . turning to @xmath210 , one observes that the prediction of the cqsm shows a somewhat peculiar dependence on the pion mass . starting from a fairly small value in the chiral limit ( @xmath213 ) , it first increases as @xmath183 increases , but as @xmath183 further increases it begins to decrease , thereby showing a tendency to match the lattice prediction in the heavy pion region . very interestingly , putting aside the absolute value , a similar @xmath183 dependence is also observed in the chiral extrapolation of the lattice prediction for the momentum fraction @xmath214 shown in fig.25 of @xcite . physically , the quantity @xmath210 represents the difference of the momentum fractions carried by the @xmath69-quark and the @xmath70-quark . the empirical value for it is @xmath215 @xcite . 
one sees that the prediction of the cqsm in the chiral limit is not far from this empirical information , although a more serious comparison must take account of the scale dependence of @xmath216 . [ figure [fig : b1020is ] : the predictions of the cqsm for @xmath218 and @xmath217 as functions of @xmath183 , together with the corresponding lattice predictions @xcite,@xcite . the meaning of the symbols is the same as in fig.[fig : a1020is ] . ] next , shown in fig.[fig : b1020is ] are the cqsm predictions for @xmath218 and @xmath217 . the former quantity is related to the isoscalar combination of the nucleon anomalous magnetic moment as @xmath219 . ( we recall that its empirical value is @xmath220 . ) we find that this quantity is very sensitive to the variation of the pion mass . it appears that the cqsm prediction @xmath221 corresponding to the chiral limit underestimates the observation significantly . however , the difference is exaggerated in this comparison . in fact , if we carry out the comparison for the total isoscalar magnetic moment of the nucleon @xmath222 , the cqsm in the chiral limit gives @xmath223 in comparison with the observed value @xmath224 . to our knowledge , no theoretical predictions are given for this quantity by either the lhpc or the qcdsf collaboration . the right panel of fig.[fig : b1020is ] shows the predictions for @xmath217 , which is sometimes called the isoscalar part of the nucleon anomalous gravitomagnetic moment , or alternatively the _ net quark contribution to the nucleon anomalous gravitomagnetic moment _ . as already pointed out , the prediction of the cqsm for this quantity is exactly zero : i.e. @xmath169 the explicit numerical calculation also confirms it . 
it should be recognized that the above result @xmath225 obtained in the cqsm is just a necessary consequence of the _ momentum sum rule _ and the _ total nucleon spin sum rule _ , both of which are saturated by the quark field only in the cqsm as @xmath226 and @xmath227 \ = \ \langle j \rangle^{u + d } \ = \ \frac{1}{2 } .\ ] ] in real qcd , the gluon also contributes to these sum rules , thereby leading to more general identities : @xmath228 + [ a_{20}^g ( 0 ) + b_{20}^g ( 0 ) ] = 1 , \end{aligned}\ ] ] which imply that only the sum of @xmath217 and @xmath229 is forced to vanish , as @xmath230 . ( while we neglect here the contributions of quarks other than the @xmath69- and @xmath70-quarks , this loses no generality in our discussion below . in fact , to include them , we have only to replace the combination @xmath231 by @xmath232 . ) the above nontrivial identity states that the net contribution of the quark and gluon fields to the anomalous gravitomagnetic moment of the nucleon must be zero . an interesting question is whether the quark and gluon contributions to the anomalous gravitomagnetic moment vanish separately or are both large with opposite signs . a perturbative analysis based on a very simple toy model indicates the latter possibility @xcite . on the other hand , a nonperturbative analysis within the framework of lattice qcd indicates that the net quark contribution to the anomalous gravitomagnetic moment is small or nearly zero , @xmath233 @xcite,@xcite . ( to be more precise , one sees that the prediction of the lhpc collaboration for @xmath217 is slightly negative @xcite , while that of the qcdsf group is slightly positive @xcite . ) this strongly indicates a surprising possibility that the quark and gluon contributions to the anomalous gravitomagnetic moment of the nucleon may separately vanish . 
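as a short worked step, the vanishing of the summed anomalous gravitomagnetic moment follows by subtracting the momentum sum rule from the spin sum rule. the display below is a sketch written in the notation of the text, with the two identities in the form suggested by the surviving fragment of the equation above; the last line assumes, in addition, that the quark part vanishes separately.

```latex
\[
\begin{aligned}
A_{20}^{u+d}(0) + A_{20}^{g}(0) &= 1 , \\
[\,A_{20}^{u+d}(0) + B_{20}^{u+d}(0)\,] + [\,A_{20}^{g}(0) + B_{20}^{g}(0)\,] &= 1 , \\
\Longrightarrow \quad B_{20}^{u+d}(0) + B_{20}^{g}(0) &= 0 , \\
\text{and if } B_{20}^{u+d}(0) = 0 : \quad
\langle J \rangle^{u+d} = \tfrac{1}{2}\,[\,A_{20}^{u+d}(0) + B_{20}^{u+d}(0)\,]
 &= \tfrac{1}{2}\,A_{20}^{u+d}(0) .
\end{aligned}
\]
```

the last line is the proportionality between the quark momentum fraction and the quark angular momentum fraction; the same step applied to the gluon part gives the analogous relation for the gluons.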
worthy of special mention here is an interesting argument given by teryaev some years ago , claiming that the vanishing of the net quark contribution to the anomalous gravitomagnetic moment of the nucleon , violated in perturbation theory , is expected to be restored in full nonperturbative qcd due to confinement @xcite,@xcite,@xcite . very interestingly , if this actually happens , it leads to a surprisingly simple result , i.e. the proportionality of the quark momentum and angular momentum fractions @xmath234 as advocated by teryaev @xcite,@xcite,@xcite . a far-reaching physical consequence resulting from this observation was extensively discussed in our recent report @xcite . ( see also the discussion at the end of this section . ) [ figure : @xmath236 and @xmath235 as functions of @xmath183 , together with the corresponding lattice predictions @xcite,@xcite . the meaning of the symbols is the same as in fig.[fig : a1020is ] . ] next , we show in fig.[fig : b1020iv ] the predictions for the isovector case , i.e. @xmath236 and @xmath235 . we recall first that the quantity @xmath236 represents the isovector combination of the nucleon anomalous magnetic moment @xmath237 , the empirical value of which is known to be @xmath238 . one finds that this quantity is extremely sensitive to the variation of the pion mass , especially near @xmath239 . this is only natural if one remembers the important role of the pion cloud in the isovector magnetic moment of the nucleon . ( one may notice that the prediction of the cqsm for @xmath240 slightly underestimates its empirical value even in the chiral limit . we recall , however , that , within the framework of the cqsm , there is an important @xmath241 correction , or 1st order rotational correction , to some isovector quantities like the isovector magnetic moment of the nucleon in question or the axial-vector coupling constant of the nucleon @xcite@xcite . 
this next-to-leading correction in @xmath241 should also be taken into account in more advanced investigations . ) shown in the right panel of fig.[fig : b1020iv ] are the theoretical predictions for @xmath235 , half of which can be interpreted as the difference of the total angular momentum carried by the @xmath69-quark and the @xmath70-quark fields according to ji 's angular momentum sum rule @xcite . the cqsm predicts a fairly small value for this quantity , in contrast to the lattice predictions of sizable magnitude . it seems that the pion mass dependence resolves this discrepancy only partially . here we argue that the reason why the cqsm ( in the chiral limit ) gives a rather small prediction for this quantity is intimately connected with the characteristic @xmath1 dependence of the quantity @xmath242 , the forward limit of the isovector unpolarized spin-flip gpd of the nucleon . to show it , we first recall that , within the theoretical framework of the cqsm , @xmath236 as well as @xmath243 are calculated as the difference of @xmath244 and @xmath245 and of @xmath246 and @xmath247 , respectively , as @xmath248 although the quantities on the rhs can be calculated directly without recourse to any distribution functions , they can also be evaluated as @xmath1-weighted integrals of the corresponding gpds as @xmath249 the distribution function @xmath242 has already been calculated within the cqsm in our recent paper @xcite . as shown there , the dirac sea contribution to this quantity has a sizably large peak around @xmath250 . since this significant peak due to the deformed dirac-sea quarks is approximately symmetric with respect to the reflection @xmath251 , it hardly contributes to the second moment @xmath252 , whereas it gives a sizable contribution to the first moment @xmath253 . the predicted significant peak of @xmath242 around @xmath250 can physically be interpreted as the effect of the pion cloud . one can convince oneself of this in several ways . 
first , we investigate how this behavior of @xmath242 changes as the pion mass is varied . [ figure : obtained with @xmath182 and @xmath239 . ] [ figure : dependence of @xmath254 . ] shown in fig.[fig : eiv_pi0 ] and in fig.[fig : eiv_pi24 ] are the cqsm predictions for @xmath242 with several values of @xmath183 , i.e. @xmath255 , and @xmath256 . one clearly sees that the height of the peak around @xmath250 , due to the deformed dirac-sea quarks , decreases rapidly as @xmath183 increases . this supports our interpretation of this peak as the effect of pion clouds . on the other hand , one also observes that the magnitude of the valence quark contribution , peaked around @xmath257 , gradually increases as @xmath183 becomes large . this behavior of @xmath242 turns out to cause a somewhat unexpected @xmath183 dependence of @xmath253 and @xmath252 . as a function of @xmath183 , the dirac sea contribution to @xmath253 decreases fast , whereas the valence quark contribution to it increases slowly , so that the total @xmath253 becomes a decreasing function of @xmath183 . on the other hand , owing to the approximate odd-function nature of the dirac sea contribution to @xmath258 with respect to @xmath1 , it hardly contributes to @xmath259 independently of the pion mass , while the valence quark contribution to @xmath258 is an increasing function of @xmath183 , thereby leading to the result that the net @xmath259 is an increasing function of @xmath183 . [ figure : obtained with @xmath182 and @xmath239 . ] [ figure : dependence of @xmath260 . ] we can give still further support to the above-mentioned interpretation of the contribution of the dirac-sea quarks . to see it , we first recall that the theoretical unpolarized distribution function @xmath261 appearing in the decomposition @xmath262 also has a sizable peak around @xmath250 due to the deformed dirac-sea quarks . 
as shown in fig.[fig : fiv_pi0 ] and in fig.[fig : fiv_pi24 ] , this peak is again a rapidly decreasing function of @xmath183 , supporting our interpretation of it as the effects of pion clouds . here , we can say more . we point out that this small@xmath1 behavior of @xmath261 is just what is required by the famous nmc measurement @xcite . to confirm it , first remember that the distribution @xmath263 in the negative @xmath1 region should actually be interpreted as the distribution of antiquarks . to be explicit , it holds that @xmath264 the large and positive value of @xmath261 in the negative @xmath1 region close to @xmath250 means that @xmath265 is negative , i.e. the dominance of the @xmath7quark over the @xmath8quark inside the proton , which has been established by the nmc measurement @xcite . evolved to @xmath266 and @xmath267 in comparison with the hermes and nutev data at the corresponding energy scales @xcite,@xcite.,width=340,height=283 ] shown in fig.[fig : dbdu ] are the predictions of the cqsm for @xmath268 evolved to the high energy scales corresponding to the experimental observation @xcite . ( the theoretical predictions here were obtained with @xmath182 and @xmath269 . ) the model reproduces well the observed behavior of @xmath268 , although the magnitude of the flavor asymmetry in smaller @xmath1 region seems to be slightly overestimated . it is a widely accepted fact that this flavor asymmetry of the sea quark distribution in the proton can physically be understood as the effects of pion cloud at least qualitatively @xcite@xcite . this then supports our interpretation of the effects of the deformed dirac  sea quarks in @xmath242 and @xmath261 as the effects of pion clouds . .the @xmath270 dependencies of @xmath271 , @xmath272 , and @xmath273 in the cqsm with @xmath182 . 
[ table : tabivb20 ] in this paper , we have investigated the generalized form factors of the nucleon , which will be extracted through near-future measurements of the generalized parton distribution functions , within the framework of the cqsm . particular emphasis is put on the pion mass dependence as well as the scale dependence of the model predictions , which we compare with the corresponding predictions of lattice qcd by the lhpc and the qcdsf collaborations carried out in the heavy pion regime around @xmath199 . the generalized form factors contain the ordinary electromagnetic form factors of the nucleon such as the dirac and pauli form factors of the proton and the neutron . we have shown that the cqsm with good chiral symmetry reproduces well the general behavior of the observed electromagnetic form factors , while the lattice simulations by the above two groups have a tendency to underestimate the electromagnetic sizes of the nucleon . undoubtedly , this can not be unrelated to the fact that the above two lattice simulations were performed with an unrealistically heavy pion mass . we have also tried to figure out the underlying spin contents of the nucleon through the analysis of the gravitoelectric and gravitomagnetic form factors of the nucleon , by taking care of the pion mass dependencies as well as of the scale dependencies of the relevant quantities . after taking account of the scale dependencies by means of the qcd evolution equations at the nlo in the @xmath274 scheme , the cqsm predicts , at @xmath275 , that @xmath276 , and @xmath277 , which means that the quark orbital angular momentum carries a sizable amount of the total nucleon spin even at such a relatively high energy scale . this contradicts the conclusion of the lhpc and qcdsf collaborations , which indicates that the total orbital angular momentum of quarks is very small or consistent with zero . 
it should be recognized , however , that the prediction of the cqsm for the total quark angular momentum is not extremely far from the corresponding lattice prediction @xmath278 at the same renormalization scale . the cause of the discrepancy can therefore be traced back to the lhpc and qcdsf lattice qcd predictions for the quark spin fraction @xmath279 around 0.6 , which contradict not only the prediction of the cqsm but also the emc observation . as was shown in our recent paper @xcite , @xmath279 is a quantity that is extremely sensitive to the variation of the pion mass , especially in the region close to the chiral limit . more serious lattice qcd studies on the @xmath270-dependence of @xmath279 are highly desirable . worthy of special mention is the fact that , once we accept the theoretical postulate @xmath280 , i.e. the absence of the net quark contribution to the anomalous gravitomagnetic moment of the nucleon , which is supported by both the lhpc and qcdsf lattice simulations , we are necessarily led to the surprisingly simple relations @xmath281 and @xmath282 , i.e. the proportionality of the linear and angular momentum fractions carried by the quarks and the gluons . using these relations , together with the existing empirical information for the unpolarized and the longitudinally polarized pdfs , we can give _ model-independent predictions _ for the quark and gluon contents of the nucleon spin . for instance , with combined use of the mrst2004 fit @xcite and the dns2005 fit @xcite , we obtain @xmath283 , @xmath284 , and @xmath285 at @xmath275 . since @xmath286 ( as well as @xmath287 ) is a rapidly decreasing function of the energy scale , while the scale dependence of @xmath279 is very weak , we must conclude that the former is even more dominant over the latter at scales below @xmath288 , where any low energy models are supposed to hold . 
the situation is a little more complicated in the flavor  nonsinglet ( or isovector ) channel , because @xmath289 , and also because the cqsm and the lattice qcd give fairly different predictions for @xmath271 . as compared with the lattice prediction for @xmath271 around @xmath290 , the predictions of the cqsm turns out to be around @xmath291 . we have argued that the relatively small value of @xmath246 obtained in the cqsm is intimately connected with the small @xmath1 enhancement of the generalized parton distribution @xmath292 , which is dominated by the clouds of pionic @xmath293 excitation around @xmath294 . ( we recall that the 2nd moment of @xmath292 gives @xmath246 . ) unfortunately , such a @xmath1dependent distribution as @xmath292 can not be accessed within the framework of lattice qcd . still , the predicted small @xmath1 behavior of @xmath295 as well as of @xmath296 indicates again the importance of chiral dynamics in the nucleon structure function physics , which has not been fully accounted for in the lattice qcd simulation at the present level . this work is supported in part by a grant  in  aid for scientific research for ministry of education , culture , sports , science and technology , japan ( no . c16540253 ) here we closely follow the proof of the momentum sum rule given in @xcite , by taking into account a necessary modification in the case of @xmath297 . the starting point is the following expression for the soliton mass ( or the static soliton energy ) : @xmath298 \  \ ( h \rightarrow h_0 ) \ + \ e_m , \ ] ] with @xmath299 \ = \  \,f_\pi^2 \,m_\pi^2 \,\int \ , [ \cos f(r )  1 ] \,d^3 x . 
\label{eq : emeson}\ ] ] the soliton mass must be stationary with respect to an arbitrary variation of the chiral field @xmath300 or equivalently the soliton profile @xmath52 , which lead to a saddle point equation : @xmath301 \ + \ \delta e_m \ = \ 0 .\ ] ] here we consider a particular ( dilatational ) variation of chiral field @xmath302 for infinitesimal @xmath3 , we have @xmath303 so that @xmath304 \ = \ \xi \ , ( [ x^k \partial_k , h ]  i \gamma^0 \gamma^k \partial_k ) .\end{aligned}\ ] ] noting the identity @xmath305 ) \ = \ \mbox{sp } ( [ h , \theta ( e_0  h + i \varepsilon ) ] x^k \partial_k ) \ = \ 0 , \end{aligned}\ ] ] we therefore obtain a key identity @xmath306 =  \,\delta e_m . \label{eq : keyid}\ ] ] now , by using ( [ eq : emeson ] ) together with the relations , @xmath307 we get @xmath308 here , taking account of the boundary condition @xmath309 we can manipulate as @xmath310 we thus find an important relation : @xmath311 putting this relation into ( [ eq : keyid ] ) , we have @xmath312 or @xmath313 if we evaluate the trace sum above by using the eigenstates of the static dirac hamiltonian @xmath50 as a complete set of basis , ( [ eq : traceap ] ) can also be written as @xmath314 which is the relation quoted in ( [ eq : alfp ] ) . we point out that our result has a correct chiral limit , since @xmath315 as @xmath316 and therefore @xmath317 in conformity with the proof given in ref.@xcite ."  "with a special intention of clarifying the underlying spin contents of the nucleon , we investigate the generalized form factors of the nucleon , which are defined as the @xmath0th @xmath1moments of the generalized parton distribution functions , within the framework of the chiral quark soliton model .
a particular emphasis is put on the pion mass dependence of final predictions , which we shall compare with the predictions of lattice qcd simulations carried out in the so  called heavy pion region around @xmath2 .
we find that some observables are very sensitive to the variation of the pion mass .
it will be argued that the negligible importance of the quark orbital angular momentum indicated by the lhpc and qcdsf lattice collaborations might be true in the unrealistic heavy pion world , but it is not necessarily the case in our real world close to the chiral limit ." 
"let @xmath1 . let @xmath2\longrightarrow [ 0,\pi_{p , q}/2]$ ] be the integral @xmath3 where @xmath4 . the @xmath0_sine functions _ , @xmath5 , $ ] are defined to be the inverses of @xmath6 , @xmath7\ ] ] extended to @xmath8 by the rules @xmath9 which make them periodic , continuous , odd with respect to 0 and even with respect to @xmath10 . these are natural generalisations of the sine function , indeed @xmath11 and they are known to share a number of remarkable properties with their classical counterpart @xcite . among these properties lies the fundamental question of completeness and linear independence of the family @xmath12 where @xmath13 . this question has received some attention recently @xcite , with a particular emphasis on the case @xmath14 . in the latter instance , @xmath15 is the set of eigenfunctions of the generalised eigenvalue problem for the one  dimensional @xmath16laplacian subject to dirichlet boundary conditions @xcite , which is known to be of relevance in the theory of slow / fast diffusion processes , @xcite . see also the related papers @xcite . set @xmath17 , so that @xmath18 is a schauder basis of the banach space @xmath19 for all @xmath20 . the family @xmath21 is also a schauder basis of @xmath22 if and only if the corresponding _ change of coordinates map _ , @xmath23 , extends to a linear homeomorphism of @xmath22 . the fourier coefficients of @xmath24 associated to @xmath25 obey the relation @xmath26 for @xmath27 , let @xmath28 ( note that @xmath29 for @xmath30 ) and let @xmath31 be the linear isometry such that @xmath32 . then @xmath33 so that the change of coordinates takes the form @xmath34 notions of `` nearness '' between bases of banach spaces are known to play a fundamental role in classical mathematical analysis , @xcite , @xcite or @xcite . unfortunately , the expansion strongly suggests that @xmath21 is not globally `` near '' @xmath18 , e.g. in the krein  lyusternik or the paley  wiener sense , @xcite . 
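the construction of the generalised sine functions described above can be tried out numerically. the following is a minimal sketch, not the authors' code: it assumes the integral has the form F_{p,q}(y) = \int_0^y (1 - t^q)^{-1/p} dt with pi_{p,q} = 2 F_{p,q}(1), which is the usual convention in the literature but is hidden behind a placeholder here. the names `pi_pq`, `f_pq` and `sin_pq` are hypothetical helpers introduced only for illustration.

```python
import math


def pi_pq(p: float, q: float) -> float:
    """Assumed pi_{p,q} = 2 * int_0^1 (1 - t^q)^(-1/p) dt = (2/q) * B(1/q, 1 - 1/p)."""
    a, b = 1.0 / q, 1.0 - 1.0 / p
    return (2.0 / q) * math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def f_pq(y: float, p: float, q: float, n: int = 2000) -> float:
    """Assumed F_{p,q}(y) = int_0^y (1 - t^q)^(-1/p) dt for 0 <= y <= 1.

    The substitution t = 1 - s^k with k = p / (p - 1) removes the endpoint
    singularity at t = 1, so a plain midpoint rule is adequate.
    """
    if y <= 0.0:
        return 0.0
    k = p / (p - 1.0)
    lo = (1.0 - min(y, 1.0)) ** (1.0 / k)
    h = (1.0 - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * h
        u = s ** k  # u = 1 - t, always strictly between 0 and 1 here
        # 1 - (1 - u)^q evaluated stably for very small u
        one_minus_tq = -math.expm1(q * math.log1p(-u))
        total += k * s ** (k - 1.0) * one_minus_tq ** (-1.0 / p)
    return total * h


def sin_pq(x: float, p: float, q: float) -> float:
    """Inverse of f_pq on [0, pi_pq/2], extended to be odd about 0 and
    even about pi_pq/2 (hence periodic with period 2 * pi_pq)."""
    half = pi_pq(p, q) / 2.0
    x = math.fmod(x, 4.0 * half)
    if x < 0.0:
        x += 4.0 * half
    if x > 2.0 * half:  # the two symmetries give sin(x + pi_pq) = -sin(x)
        return -sin_pq(x - 2.0 * half, p, q)
    if x > half:  # even with respect to pi_pq / 2
        x = 2.0 * half - x
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on the increasing function f_pq
        mid = 0.5 * (lo + hi)
        if f_pq(mid, p, q) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

under the assumed definition, `pi_pq(2, 2)` equals the usual pi and `sin_pq(x, 2, 2)` agrees with the classical sine, which gives a quick sanity check before trying other pairs (p, q).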
therefore classical arguments , such as those involving the paley  wiener stability theorem , are unlikely to be directly applicable in the present context . in fact , more rudimentary methods can be invoked in order to examine the invertibility of the change of coordinates map . from it follows that @xmath35 in @xcite it was claimed that the left side of held true for all @xmath36 where @xmath37 was determined to lie in the segment @xmath38 . hence @xmath21 would be a schauder basis , whenever @xmath39 . further developments in this respect were recently reported by bushell and edmunds @xcite . these authors cleverly fixed a gap originally published in ( * ? ? ? * lemma 5 ) and observed that , as the left side of ceases to hold true whenever @xmath40 the argument will break for @xmath14 near @xmath41 . therefore , the basisness question for @xmath21 should be tackled by different means in the regime @xmath42 . more recently @xcite , edmunds , gurka and lang , employed in order to show invertibility of @xmath43 for general pairs @xmath44 , as long as @xmath45 since is guaranteed whenever @xmath46 this allows @xmath47 for @xmath48 . however , note that a direct substitution of @xmath14 in , only leads to the sub  optimal condition @xmath49 . in section [ linearind ] below we show that the family @xmath21 is @xmath50_linearly independent _ for all @xmath1 , see theorem [ likernelandspan ] . in section [ ribap ] we establish conditions ensuring that @xmath43 is a homeomorphism of @xmath51 in a neighbourhood of the region in the @xmath44plane where @xmath52 see theorem [ inprovement ] and also corollary [ beyonda ] . for this purpose , in section [ criteria ] we find two further criteria which generalise in the hilbert space setting , see corollaries [ main_1 ] and [ main_2 ] . in this case , the _ riesz constant _ , @xmath53 characterises how @xmath21 deviates from being an orthonormal basis . 
these new statements yield upper bounds for @xmath54 , which improve upon those obtained from the right side of , even when the latter is applicable . the formulation of the alternatives to presented below relies crucially on work developed in section [ toep_s ] . from lemma [ multareshifts ] we compute explicitly the wold decomposition of the isometries @xmath31 : they turn out to be shifts of infinite multiplicity . hence we can extract from the expansion suitable components which are toeplitz operators of scalar type acting on appropriate hardy spaces . as the theory becomes quite technical for the case @xmath55 and all the estimates analogous to those reported below would involve a dependence on the parameter @xmath56 , we have chosen to restrict our attention with regards to these improvements only to the already interesting hilbert space setting . section [ casep = q ] is concerned with particular details of the case of equal indices @xmath14 , and it involves results on both the general case @xmath20 and the specific case @xmath57 . rather curiously , we have found another gap which renders incomplete the proof of invertibility of @xmath43 for @xmath58 originally published in @xcite . see remark [ rem_gap ] . moreover , the application of ( * ? ? ? * theorem 4.5 ) only gets to a _ basisness threshold _ of @xmath59 , where @xmath60 is defined by the identity @xmath61 see also ( * ? ? ? * remark 2.1 ) . in theorem [ fixingbbcdg ] we show that @xmath21 is indeed a schauder basis of @xmath22 for @xmath62 where @xmath63 , see ( * ? ? ? * problem 1 ) . as @xmath64 , basisness is now guaranteed for all @xmath65 . see figure [ impro_fig_p = q ] . in section [ nume ] we report on our current knowledge of the different thresholds for invertibility of the change of coordinates map , both in the case of equal indices and otherwise . 
based on the new criteria found in section [ criteria ] , we formulate a general test of invertibility for @xmath43 which is amenable to analytical and numerical investigation . this test involves finding sharp bounds on the first few coefficients @xmath66 . see proposition [ beyond2 ] . for the case of equal indices , this test indicates that @xmath21 is a riesz basis of @xmath51 for @xmath67 where @xmath68 . all the numerical quantities reported in this paper are accurate up to the last digit shown , which is rounded to the nearest integer . in the appendix we have included fully reproducible computer codes which can be employed to verify the calculations reported . a family @xmath69 in a banach space is called @xmath50linearly independent @xcite , if @xmath70 [ likernelandspan ] for all @xmath1 , the family @xmath21 is @xmath50linearly independent in @xmath22 . moreover , if the linear extension of the map @xmath23 is a bounded operator @xmath71 , then @xmath72 for the first assertion we show that @xmath73 . let @xmath74 be such that @xmath75 where the series is convergent in the norm of @xmath22 . then @xmath76 hence @xmath77 we show that all @xmath78 by means of a double induction argument . suppose that @xmath79 . we prove that all @xmath80 . indeed , clearly @xmath81 from with @xmath82 . now assume inductively that @xmath29 for all @xmath83 . from for @xmath84 we get @xmath85 then @xmath80 for all @xmath86 . as this would contradict the fact that @xmath87 , necessarily @xmath88 . suppose now inductively that @xmath89 and @xmath90 . we prove that again all @xmath80 . firstly , @xmath81 from with @xmath91 , because @xmath92 secondly , assume by induction that @xmath29 for all @xmath93 . from for @xmath94 we get @xmath95 the latter equality is a consequence of the fact that , for @xmath96 with @xmath97 and @xmath98 , either @xmath99 ( indices for the @xmath100 ) or @xmath101 ( indices for the @xmath102 ) . hence @xmath80 for all @xmath86 . 
as this would again contradict the fact that @xmath87 , necessarily all @xmath103 so that @xmath104 . the second assertion is shown as follows . assume that @xmath105 . if @xmath106 , then @xmath107 for all @xmath108 , so @xmath109 which in turns means that @xmath110 for all @xmath111 . on the other hand , if the latter holds true for @xmath112 , then @xmath113 for all @xmath114 , so @xmath115 , as required . therefore , @xmath21 is a riesz basis of @xmath51 if and only if @xmath105 and @xmath116 . a simple example illustrates how a family of dilated periodic functions can break its property of being a riesz basis . [ ex1 ] let @xmath117 $ ] . take @xmath118 by virtue of lemma [ toep ] below , @xmath119 is a riesz basis of @xmath51 if and only if @xmath120 . for @xmath121 we have an orthonormal set . however it is not complete , as it clearly misses the infinite  dimensional subspace @xmath122 . the fundamental decomposition of @xmath43 given in allows us to extract suitable components formed by toeplitz operators of scalar type , @xcite . in order to identify these components , we begin by determining the wold decomposition of the isometries @xmath31 , @xcite . see remark [ diri ] . [ multareshifts ] for all @xmath123 , @xmath124 is a shift of infinite multiplicity . define @xmath125 then @xmath126 for @xmath127 , @xmath128 , and @xmath129 one  to  one and onto for all @xmath130 . therefore indeed @xmath31 is a shift of multiplicity @xmath131 . let @xmath132 . the hardy spaces of functions in @xmath133 with values in the banach space @xmath134 are denoted below by @xmath135 . let @xmath136 be a holomorphic function on @xmath137 and fix @xmath138 . let @xmath139 let the corresponding toeplitz operator ( * ? ? ? * ( 5  1 ) ) @xmath140 let @xmath141 by virtue of lemma [ multareshifts ] ( see ( * ? ? ? * and 5.2 ) ) , there exists an invertible isometry @xmath142 such that @xmath143 . 
below we write @xmath144 [ generic_toep ] @xmath145 in is invertible if and only if @xmath146 . moreover @xmath147 observe that @xmath148 is scalar analytic in the sense of @xcite . since @xmath149 is holomorphic in @xmath137 , then @xmath150 and @xmath151 ( * ? ? ? * theorem a(iii ) ) . if @xmath152 , then @xmath153 is also holomorphic in @xmath137 . the scalar toeplitz operator @xmath154 is invertible if and only if @xmath146 . moreover , @xcite , @xmath155 the matrix of @xmath148 has the block representation @xcite @xmath156 the matrix associated to @xmath154 has exactly the same scalar form , replacing @xmath157 by @xmath158 . then , @xmath148 is invertible if and only if @xmath154 is invertible , and @xmath159 hence @xmath160 [ pert ] let @xmath161 for @xmath145 as in . if @xmath162 , then @xmath43 is invertible . moreover @xmath163 since @xmath145 is invertible , write @xmath164 . if additionally @xmath165 , then @xmath166 [ diri ] it is possible to characterise the change of coordinates @xmath43 in terms of dirichlet series , and recover some of the results here and below directly from this characterisation . see for example the insightful paper @xcite and the complete list of references provided in the addendum @xcite . however , the full technology of dirichlet series is not needed in the present context . a further development in this direction will be reported elsewhere . a proof of can be achieved by applying corollary [ pert ] assuming that @xmath167 our next goal is to formulate concrete sufficient condition for the invertibility of @xmath43 and corresponding bounds on @xmath54 , which improve upon whenever @xmath57 . for this purpose we apply corollary [ pert ] assuming that @xmath145 has now the three  term expansion @xmath168 let @xmath169 let @xmath170 see figure [ region ] . optimal region of invertibility in lemma [ toep ] . in this picture the horizontal axis is @xmath171 and the vertical axis is @xmath172.,height=377 ] [ toep ] let @xmath57 . 
let @xmath173 . the operator @xmath174 is invertible if and only if @xmath175 . moreover @xmath176 let @xmath177 be associated with @xmath145 as in section [ toep_s ] . the first assertion is a consequence of the following observation . if @xmath178 , then @xmath179 has roots @xmath180 conjugate with each other and @xmath181 if and only if @xmath182 . otherwise @xmath179 has two real roots . if @xmath183 and @xmath184 , then the smallest in modulus root of @xmath179 would lie in @xmath137 if and only if @xmath185 . if @xmath186 and @xmath187 , then the root of @xmath179 that is smallest in modulus would lie in @xmath137 if and only if @xmath188 . for the second assertion , let @xmath175 and @xmath189 . by virtue of the maximum principle on @xmath179 and @xmath190 , @xmath191 since @xmath192 then @xmath193 if and only if @xmath194 . for @xmath195 , we get @xmath196 and @xmath197 . for @xmath198 , we get @xmath199 with the condition @xmath200 . by virtue of theorem [ generic_toep ] , we obtain the claimed statement . since @xmath201 for all @xmath202 , then @xmath203 . below we substitute @xmath204 and @xmath205 , then apply lemma [ toep ] appropriately in order to determine the invertibility of @xmath43 whenever pairs @xmath44 lie in different regions of the @xmath44plane . for this purpose we establish the following hierarchy between @xmath206 and @xmath207 for @xmath208 , whenever the latter are non  negative . [ ajint ] for @xmath209 or @xmath210 , we have @xmath211 . firstly observe that @xmath212 is continuous , it increases for all @xmath213 and it vanishes at @xmath214 . let @xmath209 . set @xmath215 { \mathrm{d}}x \qquad \text{and } \\ i_1=\int_{\frac14}^{\frac12 } \sin_{p , q}(\pi_{p , q}x ) [ \sin(\pi x)\sin(3\pi x ) ] { \mathrm{d}}x . \end{gathered}\ ] ] since @xmath216 then @xmath217 and @xmath218 . as @xmath219 is odd with respect to @xmath220 and @xmath221 is increasing in the segment @xmath222 , then also @xmath223 . 
hence @xmath224 ensuring the first statement of the lemma . let @xmath210 . a straightforward calculation shows that @xmath225 if and only if , either @xmath226 or @xmath227 . thus , @xmath228 has exactly five zeros in the segment @xmath229 $ ] located at : @xmath230 set @xmath231 and @xmath232 { \mathrm{d}}x .\ ] ] then @xmath233 for @xmath234 and @xmath235 for @xmath236 . since @xmath237 for all @xmath238 , then @xmath239 hence @xmath240 the next two corollaries are consequences of corollary [ pert ] and lemma [ toep ] , and are among the main results of this paper . [ main_1 ] @xmath241 let @xmath161 where @xmath242 the top on left side of and the fact that @xmath203 imply @xmath243 thus , the bottom on the left side of yields @xmath244 so indeed @xmath43 is invertible . the estimate on the riesz constant is deduced from the triangle inequality . since @xmath203 , supersedes , only when the pair @xmath44 is such that @xmath245 . from this corollary we see below that the change of coordinates is invertible in a neighbourhood of the threshold set by the condition . see proposition [ beyond2 ] and figures [ impro_fig_p = q ] and [ th10ab ] . [ main_2 ] @xmath246 the proof is similar to that of corollary [ main_1 ] . we see below that corollary [ main_1 ] is slightly more useful than corollary [ main_2 ] in the context of the dilated @xmath0sine functions . however the latter is needed in the proof of the main theorem [ inprovement ] . it is of course natural to ask what consequences can be derived from the other statement in lemma [ toep ] . for @xmath247 we have @xmath248 . hence the same argument as in the proofs of corollaries [ main_1 ] and [ main_2 ] would reduce to , and in this case there is no improvement . our first goal in this section is to establish that the change of coordinates map associated to the family @xmath21 is invertible beyond the region of applicability of . we begin by recalling a calculation which was performed in the proof of ( * ? ? ? 
* proposition 4.1 ) and which will be invoked several times below . let @xmath249 be the inverse function of @xmath250 . then @xmath251 indeed , integrating by parts twice and changing the variable of integration to @xmath252 yields @xmath253 ' \sin(j\pi x ) { \mathrm{d}}x \\ & = \frac{2 \sqrt{2 } \pi_{p , q}}{j^2\pi^2 } \int_0 ^ 1 \sin \left ( \frac{j \pi}{\pi_{p , q } } a(t)\right ) { \mathrm{d}}t . \end{aligned}\ ] ] [ inprovement ] let @xmath57 . suppose that the pair @xmath254 is such that the following two conditions are satisfied 1 . [ improa ] @xmath255 2 . [ improc ] @xmath256 . then there exists a neighbourhood @xmath257 , such that the change of coordinates @xmath43 is invertible for all @xmath258 . from the dominated convergence theorem , it follows that each @xmath259 is a continuous function of the parameters @xmath16 and @xmath260 . therefore , by virtue of and a further application of the dominated convergence theorem , also @xmath261 is continuous in the parameters @xmath16 and @xmath260 . here @xmath262 can be any fixed set of indices , but below in this proof we only need to consider @xmath263 for the first possibility and @xmath264 for the second possibility . write @xmath265 . the hypothesis implies @xmath266 , because @xmath267 therefore @xmath268 for a suitable neighbourhood @xmath269 . two possibilities are now in place . note that @xmath271 is an immediate consequence of [ improa ] and [ improc ] . by continuity of all quantities involved , there exists a neighbourhood @xmath272 such that the left hand side and hence the right hand side of hold true for all @xmath273 . substitute @xmath275 and @xmath276 . if @xmath277 , then @xmath278 indeed , the conditions on @xmath171 and @xmath172 give @xmath279 as @xmath280 , @xmath281 thus @xmath282 which is . 
hence @xmath283 thus , once again by continuity of all quantities involved , there exists a neighbourhood @xmath284 such that the left hand side and hence the right hand side of hold true for all @xmath285 . the conclusion follows by defining either @xmath286 or @xmath287 . we now examine other further consequences of the corollaries [ main_1 ] and [ main_2 ] . [ impro_implicit ] any of the following conditions ensure the invertibility of the change of coordinates map @xmath288 . 1 . [ aimplicit ] ( @xmath20 ) : @xmath289 2 . [ bimplicit ] ( @xmath57 ) : @xmath290 , @xmath245 , @xmath291 and @xmath292 3 . [ cimplicit ] ( @xmath57 ) : @xmath290 , @xmath245 , @xmath293 and @xmath294 from , it follows that @xmath295 hence the condition [ aimplicit ] implies that the hypothesis is satisfied . by virtue of lemma [ ajint ] , it is guaranteed that @xmath296 in the settings of [ bimplicit ] or [ cimplicit ] . from , it also follows that @xmath297 combining each one of these assertions with and , respectively , immediately leads to the claimed statement . we recover ( * ? ? ? * corollary 4.3 ) from the part [ aimplicit ] of this theorem by observing that for all @xmath1 , @xmath298 in fact , for @xmath299 , the better estimate @xmath300 ensures invertibility of @xmath43 for all @xmath20 whenever @xmath301 see figures [ th10ab ] and [ th10c ] . we now consider in closer detail the particular case @xmath302 . our analysis requires setting various sharp upper and lower bounds on the coefficients @xmath303 for @xmath304 . this is our first goal . employed to show bound [ a3l ] in lemma [ ajpositive ] . for reference we also show @xmath305 , @xmath306 , @xmath307 and @xmath308 . [ fig : interp ] , width=340 ] [ ajpositive ] 1 . [ a3l ] @xmath309 for all @xmath310 2 . [ a5l ] @xmath311 for all @xmath312 3 . [ a7l ] @xmath313 for all @xmath312 4 . [ a9l ] @xmath314 for all @xmath315 all the stated bounds are determined by integrating a suitable approximation of @xmath316 . 
each one requires a different set of quadrature points , but the general structure of the arguments in all cases is similar . without further mention , below we repeatedly use the fact that in terms of hypergeometric functions , @xmath317.\ ] ] let @xmath318 for @xmath319 let @xmath320 see figure [ fig : interp ] . since @xmath321 and @xmath322 is an increasing function of @xmath323 , then @xmath324 according to ( * ? ? ? * corollary 4.4 ) , @xmath316 increases as @xmath16 decreases for any fixed @xmath325 . let @xmath16 be as in the hypothesis . then @xmath326 and similarly @xmath327 by virtue of ( * ? ? ? * lemma 3 ) the function @xmath322 is strictly concave for @xmath323 . then , in fact , @xmath328 let @xmath329 since @xmath330 for @xmath331 and @xmath332 , @xmath333 note that @xmath334 set @xmath335 then @xmath336 and so @xmath337 also @xmath338 so @xmath339 let @xmath16 be as in the hypothesis . then , similarly to the previous case [ a3l ] , @xmath340 set @xmath341 by strict concavity and , @xmath342 let @xmath343 then @xmath344 as claimed . let @xmath16 be as in the hypothesis . set @xmath345 then @xmath346 hence @xmath347 put @xmath348 then , @xmath349 let @xmath350 since @xmath351 is negative for @xmath352 and positive for @xmath353 , then @xmath354 . hence @xmath355 note that @xmath356 let @xmath16 be as in the hypothesis . set @xmath357 then @xmath358 hence @xmath347 put @xmath359 then , @xmath360 let @xmath361 then @xmath362 . hence @xmath363 the next statement is a direct consequence of combining [ a3l ] and [ a9l ] from this lemma with theorem [ inprovement ] . [ beyonda ] set @xmath57 and suppose that @xmath364 is such that @xmath365 there exists @xmath366 such that @xmath43 is invertible for all @xmath367 . see figure [ impro_fig_p = q ] . [ rem_gap ] in @xcite it was claimed that the hypothesis of held true whenever @xmath36 for a suitable @xmath368 . 
the argument supporting this claim @xcite was separated into two cases : @xmath369 and @xmath370 . with our definition by a factor of @xmath371 . note that the ground eigenfunction of the @xmath16-laplacian equation in @xcite is denoted by @xmath372 and it equals @xmath373 as defined above . a key observation here is the @xmath16-pythagorean identity @xmath374 . ] of the fourier coefficients , in the latter case it was claimed that @xmath375 was bounded above by @xmath376 . as it turns out , a power 2 is missing in the term @xmath377 for this claim to be true . this corresponds to taking second derivatives of @xmath378 and it can be seen by applying the cauchy-schwarz inequality in . the missing factor is crucial in the argument and renders the proof of ( * ? ? ? * theorem 1 ) incomplete in the latter case . in the paper @xcite published a few years later , it was claimed that the hypothesis of held true for @xmath379 where @xmath60 is defined by . it was then claimed that an approximate solution of was near @xmath380 . an accurate numerical approximation of , based on analytical bounds on @xmath381 , gives the correct digits @xmath382 . therefore neither the results of @xcite nor those of @xcite include a complete proof of invertibility of the change of coordinates in a neighbourhood of @xmath383 . accurate numerical estimation of @xmath381 shows that the identity is valid as long as @xmath384 , which improves slightly upon the value @xmath60 from @xcite . however , as remarked in @xcite , the upper bound @xmath385 ensuring and hence the validity of theorem [ impro_implicit][aimplicit ] , is too crude for small values of @xmath16 . note for example that the correct regime is @xmath386 whereas @xmath387 as @xmath388 ( see appendix [ ap1 ] ) . therefore , in order to determine invertibility of @xmath43 in the vicinity of @xmath389 , it is necessary to find sharper bounds for the first few terms @xmath375 , and employ directly . this is the purpose of the next lemma . 
see figure [ impro_fig_p = q ] . [ ajbounds ] let @xmath312 . then 1 . [ a1l ] @xmath390 2 . [ a3u ] @xmath391 3 . [ a5u ] @xmath392 4 . [ a7u ] @xmath393 we proceed in a similar way as in the proof of lemma [ ajpositive ] . let @xmath16 be as in the hypothesis . set @xmath394 then @xmath395 and so @xmath347 let @xmath396 then , @xmath397 hence @xmath398 set @xmath399 then @xmath400 let @xmath401 then , @xmath402 and hence @xmath403 set @xmath404 and let @xmath405 then , @xmath406 , so @xmath407 set @xmath408 and @xmath409 then , @xmath406 and @xmath410 , so @xmath411 the following result fixes the proof of the claim made in ( * ? ? ? * claim 2 ) and improves the threshold of invertibility determined in ( * ? ? ? * theorem 4.5 ) . [ fixingbbcdg ] there exists @xmath412 , such that @xmath413\pi^2}{2\sqrt{2}\left ( \frac{\pi^2}{8}1 \frac19\frac{1}{25}\frac{1}{49 } \right ) } \qquad \forall p\in\left(p_3 , \frac{6}{5}\right).\ ] ] the family @xmath21 is a schauder basis of @xmath414 for all @xmath415 and @xmath20 . both sides of are continuous functions of the parameter @xmath416 . the right hand side is bounded . the left side is decreasing as @xmath16 increases and @xmath387 as @xmath388 . by virtue of lemma [ ajbounds ] , @xmath417 hence the first statement is ensured as a consequence of the intermediate value theorem . from , it follows that @xmath418 for all @xmath419 . lemma [ ajpositive ] guarantees positivity of @xmath207 for @xmath420 . then , by re  arranging this inequality , the second statement becomes a direct consequence of . a sharp numerical approximation of the solution of the equation with equality in gives @xmath421 . see figure [ impro_fig_p = q ] . if sharp bounds on the first few fourier coefficients @xmath259 are at hand , the approach employed above for the proof of theorem [ fixingbbcdg ] can also be combined with the criteria or . 
a natural question is whether this would lead to a positive answer to the question of invertibility for @xmath43 , whenever @xmath422 in the case of , we see below that this is indeed the case . the key statement is summarised as follows . [ beyond2 ] let @xmath57 and @xmath423 . suppose that 1 . @xmath290 , @xmath245 and @xmath424 for all other @xmath425 . 2 . @xmath426 . if @xmath427 then @xmath43 is invertible . assume that the hypotheses are satisfied . the combination of and gives @xmath428 then @xmath429 and so the conclusion follows from . we now discuss the connection between the different statements established in the previous sections with those of the papers @xcite , @xcite and @xcite . for this purpose we consider various accurate approximations of @xmath207 and @xmath430 . these approximations are based on the next explicit formulae : @xmath431 and @xmath432 here @xmath433 is the incomplete beta function , @xmath434 is the beta function and @xmath435 is the gamma function . moreover , by considering exactly the steps described in @xcite for the proof of ( * ? ? ? * ( 4.15 ) ) , it follows that @xmath436{\mathrm{d}}x \\ & = \frac{\sqrt{2}}{\pi}\int_0 ^ 1 \log\left[\cot \left ( \frac{\pi}{4 } \ { \mathcal{i}}\ ! \!\left(\frac1q,\frac{p1}{p};x^q\right ) \right)\right ] { \mathrm{d}}x . \end{aligned}\ ] ] . the positions of @xmath37 , @xmath437 and the value of @xmath438 are set only for illustration purposes , as we are only certain that @xmath439 . black indicates relevance to the general case @xmath20 while red indicates relevance for the case @xmath57 . [ impro_fig_p = q],height=241 ] let us begin with the case of equal indices . see figure [ impro_fig_p = q ] . as mentioned in the introduction , @xmath440 for @xmath41 . the condition @xmath441 is fulfilled for all @xmath442 where @xmath443 . the fourier coefficients @xmath444 for all @xmath445 whenever @xmath446 . 
remarkably we need to get to @xmath447 , for a numerical verification of the conditions of proposition [ beyond2 ] allowing @xmath448 . indeed we remark the following . 1 . for @xmath449 , the condition holds true only for @xmath450 where @xmath451 . 2 . for @xmath447 the condition does hold true for @xmath452 where @xmath68 . this indicates that the threshold for invertibility of @xmath43 in the hilbert space setting for @xmath14 is at least @xmath453 . now we examine the general case . the graphs shown in figures [ th10ab ] and [ th10c ] correspond to regions in the @xmath44-plane near @xmath454 . curves on figure [ th10ab ] that are in red are relevant only to the hilbert space setting @xmath57 . black curves pertain to @xmath20 . figure [ th10ab]_(a ) _ and a blowup shown in figure [ th10ab]_(b ) _ , have two solid ( black ) lines : one that shows the limit of applicability of theorem [ impro_implicit][aimplicit ] and one that shows the limit of applicability of the result of @xcite . the dashed line indicates where occurs . to the left of that curve is not applicable . there are two filled regions of different colours in _ ( a ) _ , which indicate where @xmath455 and where @xmath456 for @xmath208 . proposition [ beyond2 ] is not applicable in the union of these regions . we also show the lines where @xmath457 and @xmath458 . the latter forms part of the boundary of this union . the solid red line corresponding to the limit of applicability of theorem [ impro_implicit][bimplicit ] is also included in figure [ th10ab]_(a)-(d)_. to the right of that line , in the white area , we know that @xmath43 is invertible for @xmath57 . the blowup in figure [ th10ab]_(b ) _ clearly shows the gap between theorem [ impro_implicit][aimplicit ] and theorem [ impro_implicit][bimplicit ] in this @xmath57 setting . certainly @xmath459 is a point of intersection for all curves where @xmath29 for @xmath123 . 
these curves are shown in figure [ th10ab]_(c ) _ also for @xmath460 and @xmath461 . in this figure , we also include the boundary of the region where @xmath455 and the region where @xmath456 now for @xmath462 . note that the curves for @xmath463 and @xmath458 form part of the boundary of the latter . comparing _ ( a ) _ and _ ( c ) _ , the new line that cuts the @xmath16 axis at @xmath464 corresponds to the limit of where proposition [ beyond2 ] for @xmath465 is applicable ( for @xmath16 to the right of this line ) . the gap between the two red lines ( case @xmath57 ) indicates that proposition [ beyond2 ] can significantly improve the threshold for basisness with respect to a direct application of theorem [ impro_implicit][bimplicit ] . as we increase @xmath466 , the boundary of the corresponding region moves to the left , see the blowups in figure [ th10ab]_(d ) _ and _ ( e)_. the two further curves in red , located very close to the vertical axis , correspond to the precise value of the parameter @xmath466 where proposition [ beyond2 ] allows a proof of invertibility for the change of coordinates which includes the break made by . for @xmath467 the region does not include the dashed black line , for @xmath447 it does include this line . the region shown in blue indicates a possible place where corollary [ main_1 ] may still apply , but further investigation in this respect is needed . figure [ th10c ] concerns the statement of theorem [ impro_implicit][cimplicit ] . the small wedge shown in green is the only place where the former is applicable . as it turns out , it appears that the conditions of corollary [ main_2 ] prevent it from being useful for determining invertibility of @xmath43 in a neighbourhood of @xmath454 . however in the region shown in green , the upper bound on the riesz constant consequence of is sharper than that obtained from . 
part of the difficulties for a proof of basisness for the family @xmath21 in the regime @xmath470 has to do with the fact that the fourier coefficients of @xmath471 approach those of the function @xmath472 . in this appendix we show that , indeed @xmath473 note that @xmath474 let @xmath475 be the ( unique ) value , such that @xmath476 then @xmath477 let @xmath478 be the line passing through the points @xmath479 and @xmath480 . there exists a unique value @xmath481 such that @xmath482 this value is unique because of monotonicity of both sides of this equality , and it exists by bisection . as all the functions involved are continuous in @xmath16 , then also @xmath483 is continuous in the parameter @xmath16 . moreover , @xmath484 indeed , by clearing the equation defining @xmath483 , we get @xmath485 the right hand side , and thus the left hand side , approach 0 as @xmath388 . then , one ( and hence both ) of the two terms multiplying on the left should approach 0 . let @xmath486 be the polygon which has as vertices ( ordered clockwise ) @xmath487 as @xmath488 @xmath489\times\{0\ } ) \cup ( \{1\}\times [ 0,1])$ ] in hausdorff distance . then the area of @xmath486 approaches 0 as @xmath388 . moreover , @xmath486 covers the graph of @xmath490 for @xmath491 . thus @xmath492 hence , there is a point @xmath493 on the graph of @xmath494 such that @xmath495 and @xmath496 the proof of is completed from the fact that , as @xmath494 is concave ( because its inverse function is convex ) , the piecewise linear interpolant of @xmath494 for the family of nodes @xmath497 has a graph below that of @xmath494 . the following computer codes written in the open source languages octave and python can be used to verify any of the numerical estimations presented in this paper . function for computing @xmath207 with 10-digit precision . .... 
# -- function file: [a, err, np] = apq(k, p, q)
# a is the kth fourier coefficient of the p,q sine function
# err is the residual
# np number of quadrature points
#
function [a, err, np] = apq(k, p, q)
  if mod(k, 2) == 0, disp('error: k should be odd'); return; end
  [i, err, np] = quadcc(@(y) cos(k*pi*betainc(y.^q, 1/q, (p-1)/p)/2), 0, 1, 1e-10);
  a = i*2*sqrt(2)/k/pi;
....
function for computing @xmath498 with 10-digit precision .
....
# -- function file: [s, err, np] = apqsum(p, q)
# s is the sum of the fourier coefficients of the p,q sine function
# err is the residual
# np number of quadrature points
#
function [s, err, np] = apqsum(p, q)
  [i, err, np] = quadcc(@(y) log(cot(pi*betainc(y.^q, 1/q, (p-1)/p)/4)), 0, 1, 1e-10);
  s = i*sqrt(2)/pi;
....
function for computing @xmath207 with variable precision .
....
def a(k, p, q):
    """ computes the kth fourier coefficient of the p,q sine function.
    returns coefficient and residual.
    >>> from mpmath import *
    >>> mp.dps = 25; mp.pretty = True
    >>> a(1, mpf(12)/11, mpf(12)/11)
    (0.8877665848468607372062737, 1.0e-59)
    """
    # even-indexed coefficients vanish
    if isint(fraction(k, 2)):
        apq = 0; e = 0; return apq, e
    f = lambda x: cos(k*pi*betainc(1/q, (p-1)/p, 0, x**q, regularized=True)/2)
    (i, e) = quad(f, [0, 1], error=True, maxdegree=10)
    apq = i*2*sqrt(2)/k/pi
    return apq, e
....
function for computing @xmath498 with variable precision .
....
def suma(p, q):
    """ computes the sum of the fourier coefficients of the p,q sine function.
    returns sum and residual.
    >>> from mpmath import *
    >>> mp.dps = 25; mp.pretty = True
    >>> suma(mpf(12)/11, mpf(12)/11)
    (1.48634943002852603038783, 1.0e-56)
    """
    f = lambda x: log(cot(pi*betainc(1/q, (p-1)/p, 0, x**q, regularized=True)/4))
    (i, e) = quad(f, [0, 1], error=True, maxdegree=10)
    sumapq = i*sqrt(2)/pi
    return sumapq, e
....
the authors wish to express their gratitude to paul binding who suggested this problem a few years back . 
they are also grateful to stefania marcantognini for her insightful comments during the preparation of this manuscript . we acknowledge support by the british engineering and physical sciences research council ( ep/i00761x/1 ) , the research support fund of the edinburgh mathematical society and the instituto venezolano de investigaciones científicas . plane where theorem [ impro_implicit][aimplicit ] and [ bimplicit ] , as well as proposition [ beyond2 ] ( with different values of @xmath466 ) apply . in all graphs @xmath16 corresponds to the horizontal axis and @xmath260 to the vertical axis and the dotted line shows @xmath14 . [ th10ab],title="fig:",height=226 ] plane where theorem [ impro_implicit][cimplicit ] applies . even when we know @xmath43 is invertible in this region as a consequence of theorem [ impro_implicit][aimplicit ] , the upper bound on the riesz constant provided by improves upon that provided by ( case @xmath57 ) . in this graph @xmath16 corresponds to the horizontal axis and @xmath260 to the vertical axis and the dotted line shows @xmath14 . [ th10c],title="fig:",height=226 ]"  "we improve the currently known thresholds for basisness of the family of periodically dilated @xmath0-sine functions .
our findings rely on a beurling decomposition of the corresponding change of coordinates in terms of shift operators of infinite multiplicity .
we also determine refined bounds on the riesz constant associated to this family .
these results seal mathematical gaps in the existing literature on the subject ." 
"the simulations we discuss here allowed us to obtain spectra of the shg response . we employed coms(...TRUNCATED)  "we report on strong enhancement of mid  infrared second harmonic generation ( shg ) from sic nanop(...TRUNCATED) 
"with significant research efforts being directed to the development of neurocomputers based on the (...TRUNCATED)  "synaptic memory is considered to be the main element responsible for learning and cognition in huma(...TRUNCATED) 
"the segmentation process as a whole can be thought of as consisting of two tasks : recognition and (...TRUNCATED)  "this paper investigates , using prior shape models and the concept of ball scale ( b  scale ) , wa(...TRUNCATED) 
"one surprising result that has come out of the more than 200 extrasolar planet discoveries to date (...TRUNCATED)  "long time coverage and high radial velocity precision have allowed for the discovery of additional (...TRUNCATED) 
Dataset for summarization of long documents.
Adapted from this repo.
Note that the original data are pre-tokenized, so this dataset returns " ".join(text) and adds "\n" for paragraphs.
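As a rough illustration of that joining step, the sketch below assumes each document arrives as a list of paragraphs and each paragraph as a list of tokens. Both that layout and the helper name `join_pretokenized` are assumptions made for illustration, not the dataset's actual loading code.

```python
def join_pretokenized(paragraphs):
    """Join tokens with single spaces and paragraphs with newlines.

    `paragraphs` is assumed to be a list of token lists (hypothetical layout).
    """
    return "\n".join(" ".join(tokens) for tokens in paragraphs)


# Two short pre-tokenized paragraphs, in the style of the rows shown above.
doc = [
    ["additive", "models", "@xcite", "."],
    ["we", "improve", "the", "thresholds", "."],
]
print(join_pretokenized(doc))
```

Because the tokens are joined with plain spaces, punctuation stays separated from the preceding word, which matches the look of the sample rows.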
This dataset is compatible with the run_summarization.py script from Transformers if you add this line to the summarization_name_mapping variable:
"ccdv/arxiv-summarization": ("article", "abstract")
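The following is a simplified sketch of how such a mapping can be used to pick the text and summary columns. The helper `resolve_columns` is hypothetical and only imitates the general idea (use the mapped names when the dataset is known, otherwise fall back to the first two columns); the real script may differ in its details.

```python
# Mapping from dataset name to (text column, summary column).
summarization_name_mapping = {
    "ccdv/arxiv-summarization": ("article", "abstract"),
}


def resolve_columns(dataset_name, column_names, text_column=None, summary_column=None):
    """Return the (text, summary) column names to use for a dataset.

    Explicit arguments win; otherwise the mapping is consulted; otherwise
    the first two columns of the dataset are used as a fallback.
    """
    mapped = summarization_name_mapping.get(dataset_name)
    if text_column is None:
        text_column = mapped[0] if mapped else column_names[0]
    if summary_column is None:
        summary_column = mapped[1] if mapped else column_names[1]
    return text_column, summary_column


columns = ["id", "article", "abstract"]
print(resolve_columns("ccdv/arxiv-summarization", columns))
print(resolve_columns("some/unmapped-dataset", columns))
```

Without the mapping entry, the positional fallback in this sketch would select `id` as the text column, which is why adding the line matters for a dataset whose first column is an identifier.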
- id: paper id
- article: a string containing the body of the paper
- abstract: a string containing the abstract of the paper

This dataset has 3 splits: train, validation, and test.
Token counts are white space based.
| Dataset Split | Number of Instances | Avg. tokens (article / abstract) |
| --- | --- | --- |
| Train | 203,037 | 6038 / 299 |
| Validation | 6,436 | 5894 / 172 |
| Test | 6,440 | 5905 / 174 |
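A whitespace-based token count of the kind reported above can be sketched as follows. The record layout (dictionaries with `article` and `abstract` keys) follows the field list of this card, while the helper names and the toy records are assumptions for illustration.

```python
def whitespace_token_count(text):
    """Count tokens by splitting on runs of whitespace."""
    return len(text.split())


def average_tokens(records, field):
    """Average whitespace token count of `field` over a list of records."""
    counts = [whitespace_token_count(record[field]) for record in records]
    return sum(counts) / len(counts)


# Toy records standing in for dataset rows.
records = [
    {"article": "a b c d", "abstract": "a b"},
    {"article": "a b", "abstract": "c"},
]
print(average_tokens(records, "article"), "/", average_tokens(records, "abstract"))
```

Since the articles are pre-tokenized, punctuation marks count as separate tokens under this scheme.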
@inproceedings{cohan-etal-2018-discourse,
title = "A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents",
author = "Cohan, Arman and
Dernoncourt, Franck and
Kim, Doo Soon and
Bui, Trung and
Kim, Seokhwan and
Chang, Walter and
Goharian, Nazli",
booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)",
month = jun,
year = "2018",
address = "New Orleans, Louisiana",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/N18-2097",
doi = "10.18653/v1/N18-2097",
pages = "615--621",
abstract = "Neural abstractive summarization models have led to promising results in summarizing relatively short documents. We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). Our approach consists of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary. Empirical results on two large-scale datasets of scientific papers show that our model significantly outperforms state-of-the-art models.",
}