
\section{Another Secretary Problem}

As you may have already noticed, the secretary problem we discussed in the previous section
is somewhat disconnected from reality. Under what circumstances would one only be satisfied
with the \textit{absolute best} candidate? It may make more sense to maximize the average rank
of the candidate we hire, rather than the probability of selecting the best. This is the problem
we'll attempt to solve next.


\definition{}
The problem we're solving is summarized below.
Note that this is nearly identical to the classical secretary problem in the previous
section---the only thing that has changed is the goal.
\begin{itemize}
	\item We have exactly one position to fill, and we must fill it with one of $n$ applicants.
	\item These $n$ applicants, if put together, can be ranked unambiguously from \say{best} to \say{worst}.
	\item We interview applicants in a random order, one at a time.
	\item After each interview, we either reject or select the applicant.
	\item We cannot return to an applicant we've rejected.
	\item The process ends once we select an applicant.

	\vspace{2mm}

	\item Our goal is to maximize the expected rank of the applicant we hire.
\end{itemize}


\definition{}<mod>
Just like before, we need to restate this problem in the language of probability. \par
To do this, we'll say that each candidate has a \textit{quality} rating in $[0, 1]$. \par

\vspace{2mm}

Our series of applicants then becomes a series of random variables $\mathcal{X}_1, \mathcal{X}_2, ..., \mathcal{X}_n$, \par
where each $\mathcal{X}_i$ is drawn uniformly from $[0, 1]$.

\problem{}<notsatisfy>
The modification in \ref{mod} doesn't fully satisfy the constraints of the secretary problem. \par
Why not?

\begin{solution}
	If we observe $\mathcal{X}_i$ directly, we obtain \textit{absolute} scores. \par
	This is more information than the secretary problem allows us to have---we can know which of
	two candidates is better, but \textit{not by how much}.
\end{solution}

\vfill

Ignore this issue for now. We'll return to it later.

\problem{}
Let $\mathcal{X}$ be a random variable uniformly distributed over $[0, 1]$. \par
Given a real number $x$, what is $\mathcal{P}(\mathcal{X} \leq x)$, the probability that $\mathcal{X}$ is at most $x$?


\begin{solution}
	\begin{equation*}
		\mathcal{P}(\mathcal{X} \leq x) =
		\begin{cases}
			0 & x \leq 0 \\
			x & 0 < x < 1 \\
			1 & \text{otherwise}
		\end{cases}
	\end{equation*}

\end{solution}

\vfill

\problem{}
Say we have five independent random variables $\mathcal{X}_1, \mathcal{X}_2, ..., \mathcal{X}_5$, each uniformly distributed over $[0, 1]$. \par
Given some $y$ in $[0, 1]$, what is the probability that all five $\mathcal{X}_i$ are smaller than $y$?

\begin{solution}
	Assuming the $\mathcal{X}_i$ are independent and uniform on $[0, 1]$, this is $\mathcal{P}(\mathcal{X} \leq y)^5$, which is $y^5$ for $0 \leq y \leq 1$.
\end{solution}

\vfill
\pagebreak


%
% MARK: Page
%


\definition{}
Say we have a random variable $\mathcal{X}$ which we observe $n$ times. \note{(for example, we repeatedly roll a die)}
We'll arrange these observations in increasing order, labeled $x_1 < x_2 < ... < x_n$. \par
Under this definition, $x_i$ is called the \textit{$i^\text{th}$ order statistic}---the $i^\text{th}$ smallest sample of $\mathcal{X}$.

\problem{}<ostatone>
Say we have a random variable $\mathcal{X}$ uniformly distributed on $[0, 1]$, of which we take $5$ observations. \par
Given some $y$, what is the probability that $x_5 < y$? How about $x_4 < y$?

\begin{solution}
	$x_5 < y$: ~This is a restatement of the previous problem.

	\vspace{2mm}

	$x_4 < y$: ~We need at least four measurements to be smaller than $y$.
	Either exactly four are smaller and one is larger (there are five ways to choose the larger one),
	or all five are smaller. Adding these two cases, we get
	$
		5\mathcal{P}(\mathcal{X} \leq y)^4
		\mathcal{P}(\mathcal{X} > y)
		+
		\mathcal{P}(\mathcal{X} \leq y)^5
	$, which is $5y^4(1-y) + y^5$.
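
	\vspace{2mm}

	As a quick sanity check (the value $y = \frac{1}{2}$ is chosen only for illustration), the expression above evaluates to
	\begin{equation*}
		5 \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right) + \left(\frac{1}{2}\right)^5
		~=~
		\frac{5}{32} + \frac{1}{32}
		~=~
		\frac{3}{16}
	\end{equation*}
	This is larger than $\mathcal{P}(x_5 < \frac{1}{2}) = \frac{1}{32}$, as it should be, since $x_4$ is never larger than $x_5$.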
\end{solution}

\vfill

\problem{}
Consider the same setup as \ref{ostatone}, but with $n$ measurements. \par
What is the probability that $x_i < y$ for a given $y$?

\begin{solution}
	\begin{equation*}
		\mathcal{P}(x_i < y)
		~=~
		\sum_{j=i}^{n}
		\binom{n}{j} \times
		y^j
		(1-y)^{n-j}
	\end{equation*}
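
	\vspace{2mm}

	One way to read this sum, for $0 \leq y \leq 1$: $x_i < y$ holds exactly when at least $i$ of the $n$ observations fall below $y$,
	and the $j^\text{th}$ term is the probability that exactly $j$ of them do. \par
	As a check, setting $n = 5$ and $i = 4$ should recover the answer to \ref{ostatone}:
	\begin{equation*}
		\sum_{j=4}^{5}
		\binom{5}{j} \times
		y^j
		(1-y)^{5-j}
		~=~
		5y^4(1-y) + y^5
	\end{equation*}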
\end{solution}

\vfill

\remark{}
The expected value of the $i^\text{th}$ order statistic on $n$ samples of the uniform distribution is below.
\begin{equation*}
	\mathcal{E}(x_i) = \frac{i}{n+1}
\end{equation*}
We do not have the tools to derive this yet.
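
\vspace{2mm}

As a quick illustration (the choice $n = 3$ is arbitrary, made only for this example), the formula gives
\begin{equation*}
	\mathcal{E}(x_1) = \frac{1}{4}, \qquad
	\mathcal{E}(x_2) = \frac{2}{4} = \frac{1}{2}, \qquad
	\mathcal{E}(x_3) = \frac{3}{4}
\end{equation*}
so on average, three uniform samples split $[0, 1]$ into four equal pieces.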

\pagebreak

%
% MARK: Page
%


\definition{}
Recall \ref{notsatisfy}. We need one more modification. \par
In order to preserve the constraints of the problem, we will not be allowed to observe $\mathcal{X}_i$ directly. \par
Instead, we'll be given an \say{indicator} $\mathcal{I}_i$ for each $\mathcal{X}_i$, which produces values in $\{0, 1\}$. \par
If the value we observe when interviewing $\mathcal{X}_i$ is the best we've seen so far, $\mathcal{I}_i$ will produce $1$. \par
If it isn't, $\mathcal{I}_i$ produces $0$.

\problem{}
Given a secretary problem with $n$ applicants, what is $\mathcal{E}(\mathcal{I}_i)$?

\begin{solution}
	\begin{equation*}
		\mathcal{E}(\mathcal{I}_i) = \frac{1}{i}
	\end{equation*}
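
	\vspace{2mm}

	A sketch of why, assuming no two applicants tie (ties have probability zero for uniform samples): \par
	$\mathcal{I}_i$ only takes the values $0$ and $1$, so
	\begin{equation*}
		\mathcal{E}(\mathcal{I}_i)
		~=~
		0 \times \mathcal{P}(\mathcal{I}_i = 0) + 1 \times \mathcal{P}(\mathcal{I}_i = 1)
		~=~
		\mathcal{P}(\mathcal{I}_i = 1)
	\end{equation*}
	Since applicants arrive in a random order, each of the first $i$ applicants is equally likely to be the best of those $i$. \par
	The $i^\text{th}$ applicant is therefore the best seen so far with probability $\frac{1}{i}$.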
\end{solution}

\vfill


\problem{}
What is $\mathcal{E}(\mathcal{X}_i ~|~ \mathcal{I}_i = 1)$? \par
In other words, what is the expected value of $\mathcal{X}_i$ given that \par
we know this candidate is the best we've seen so far?

\begin{solution}
	This is simply the expected value of the $i^\text{th}$ order statistic on $i$ samples:
	\begin{equation*}
		\mathcal{E}(\mathcal{X}_i ~|~ \mathcal{I}_i = 1) = \frac{i}{i+1}
	\end{equation*}
\end{solution}


\vfill
\pagebreak


\problem{}
In the previous section, we found that the optimal strategy for the classical secretary problem is to
reject the first $e^{-1} \times n$ candidates, and select the next \say{best-yet} candidate we see. \par

\vspace{2mm}

How effective is this strategy for the ranked secretary problem? \par
Find the expected rank of the applicant we select using this strategy.


\vfill

\problem{}
Assuming we use the same kind of strategy as before (reject $k$, select the next \say{best-yet} candidate), \par
show that $k \approx \sqrt{n}$ optimizes the expected rank of the candidate we select.

\begin{solution}
	This is a difficult bonus problem. See
	\texttt{Bearden, J. N. (2006). A new secretary problem with rank-based selection and cardinal payoffs.}
\end{solution}

\vfill
\pagebreak