
About Order Statistics

Given a random variable $X \sim f$, we can draw a sample of size $n$. At times the cost of sorting can be lower than that of further calculation, so we form the so-called order statistics by sorting the sample in ascending order: $X_1, X_2, \ldots, X_n \to X_{(1)}, X_{(2)}, \ldots, X_{(n)}$. In doing so, we lose all information about the order in which the observations were drawn.
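
As a minimal sketch (the exponential distribution and the sample size below are arbitrary choices, purely for illustration), obtaining the order statistics is just a sort:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw a sample of size n from some continuous distribution f
# (an exponential is used here purely for illustration).
n = 10
sample = rng.exponential(scale=1.0, size=n)

# Sorting in ascending order gives the order statistics
# X_(1) <= X_(2) <= ... <= X_(n); the draw order is discarded.
order_stats = np.sort(sample)
print(order_stats)
```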

Cumulative Distribution and Density Functions

Notice that $X_{(k)} \le x$ means that at least $k$ observations in the sample are less than or equal to $x$, so we have $F_{(k)}(x) = P(X_{(k)} \le x) = \sum_{j=k}^{n} \binom{n}{j} F(x)^{j} (1-F(x))^{n-j}$. By differentiating $F_{(k)}(x)$, we can obtain its density function: $f_{(k)}(x) = \sum_{j=k}^{n} \frac{n!}{j!(n-j)!} \left( j F(x)^{j-1} (1-F(x))^{n-j} f(x) - (n-j) F(x)^{j} (1-F(x))^{n-j-1} f(x) \right)$, which telescopes and simplifies into $f_{(k)}(x) = \frac{n!}{(k-1)!(n-k)!} F(x)^{k-1} (1-F(x))^{n-k} f(x)$. Heuristically, this can be understood as selecting $k-1$ elements and requiring them to be less than $X_{(k)}$, then selecting $n-k$ elements and requiring them to be greater than $X_{(k)}$, and finally multiplying by the density at the $k$-th value.
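
A quick numerical sanity check of the density formula (a sketch; the standard normal distribution and the values of $n$ and $k$ are arbitrary choices): simulate the $k$-th order statistic many times and compare its sample mean with the mean implied by $f_{(k)}$.

```python
import numpy as np
from math import factorial
from scipy import stats, integrate

rng = np.random.default_rng(1)
n, k = 10, 3           # sample size and which order statistic (1-indexed)
reps = 200_000         # number of simulated samples

# Simulate the k-th order statistic of n standard normal draws.
samples = rng.standard_normal((reps, n))
x_k = np.sort(samples, axis=1)[:, k - 1]

# f_(k)(x) = n!/((k-1)!(n-k)!) * F(x)^(k-1) * (1-F(x))^(n-k) * f(x)
coef = factorial(n) / (factorial(k - 1) * factorial(n - k))

def f_k(x):
    F, f = stats.norm.cdf(x), stats.norm.pdf(x)
    return coef * F ** (k - 1) * (1 - F) ** (n - k) * f

# The simulated mean and the mean computed from f_(k) should agree closely.
mean_from_density, _ = integrate.quad(lambda x: x * f_k(x), -np.inf, np.inf)
print(x_k.mean(), mean_from_density)
```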

$X_{(k)}$ as an Estimator

An interesting property of $X_{(k)}$ is that it is an unbiased estimator of the $k$-th of the $n+1$ quantiles of $f$, i.e. the $k/(n+1)$ quantile. To deal with quantiles, a lemma comes in handy:

$Y = F(X) \sim U(0,1)$ for any continuous distribution $X \sim f$, where $F(x) = \int_{-\infty}^{x} f(t)\,dt$ is the CDF.
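
A small numerical illustration of the lemma (a sketch; the exponential distribution is an arbitrary choice): pushing the draws through their own CDF should yield values indistinguishable from $U(0,1)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Draw from an arbitrary continuous distribution (exponential here)
# and apply its own CDF to each draw.
x = rng.exponential(scale=2.0, size=100_000)
y = stats.expon(scale=2.0).cdf(x)

# Y = F(X) should be uniform on (0, 1); a Kolmogorov-Smirnov test
# against U(0, 1) should report a large p-value.
print(stats.kstest(y, "uniform"))
```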

The lemma is easy to prove: $P(Y \le y) = P(X \le F^{-1}(y)) = F(F^{-1}(y)) = y$, thus $Y \sim U(0,1)$. We can now proceed to prove that $E(Y_{(k)}) = \frac{k}{n+1}$: $E(Y_{(k)}) = \int_0^1 \frac{n!}{(k-1)!(n-k)!}\, y^{k-1} (1-y)^{n-k}\, y \, dy = \frac{\Gamma(n+1)}{\Gamma(k)\,\Gamma(n-k+1)}\, B(k+1, n-k+1) = \frac{k}{n+1}$. Now notice that $Y_{(k)} = F(X_{(k)})$, so $P(X \le E(X_{(k)})) = F(E(F^{-1}(Y_{(k)}))) = \frac{k}{n+1}$. Admittedly, the last step is very clumsy and will need to be fixed.
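
Equivalently, $Y_{(k)} \sim \mathrm{Beta}(k, n-k+1)$, whose mean is $k/(n+1)$. A small simulation can confirm this (a sketch; the values of $n$ and $k$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 9, 4
reps = 500_000

# k-th order statistic of n i.i.d. U(0,1) draws, averaged over many samples.
u = rng.uniform(size=(reps, n))
y_k = np.sort(u, axis=1)[:, k - 1]

print(y_k.mean(), k / (n + 1))   # the two numbers should be close
```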

#"hs-maths"