
Minimal Maximum Expected Length

Updated 2 January 2026
  • Minimal maximum expected length is an extremal problem that studies expected length functionals in both prefix coding and longest common subsequence settings.
  • It employs key methodologies including Schur-concavity, spectral analysis, and quadratic programming to derive optimal bounds and distributions.
  • The analysis shows that the uniform distribution maximizes the minimum expected codeword length of a prefix code, while suitably chosen non-uniform distributions on permutations achieve a strictly smaller expected LCS than the uniform distribution.

Minimal maximum expected length refers to extremal problems involving the expected value of a length functional in probabilistic or information-theoretic settings. Two prominent lines of research address these questions. The first investigates the maximal minimal expected codeword length for prefix codes under variable source distributions, focusing on Schur-concavity and code-tree constructions. The second explores the minimum expected length of the longest common subsequence (LCS) between two i.i.d. random permutations drawn from an optimally chosen distribution, using spectral and combinatorial techniques.

1. Minimum Expected Length in Prefix Coding

Let $\mathcal{P}_n$ denote the set of all probability mass functions (PMFs) $(p_1, p_2, \ldots, p_n)$ on $n$ symbols, with $p_i > 0$ and $\sum_{i=1}^n p_i = 1$. For an integer $D \geq 2$, let $L_D(P)$ be the minimum expected codeword length of a $D$-ary prefix code for the discrete memoryless source $P$. Minimizing over codeword length vectors $\ell = (\ell_1, \ldots, \ell_n) \in \mathbb{Z}_{\geq 0}^n$ that satisfy the $D$-ary Kraft inequality $\sum_{i=1}^{n} D^{-\ell_i} \leq 1$, the functional

$$L_D(P) = \min_{\ell:\ \sum_{i=1}^n D^{-\ell_i} \leq 1}\ \sum_{i=1}^{n} p_i \ell_i$$

gives the minimal expected length over all prefix codes. For each $P$, the minimum is achieved by a Huffman code (Manickam, 2019).
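For a tiny alphabet, the displayed minimization can be carried out by brute force over Kraft-feasible length vectors, which also confirms the Huffman value. A minimal sketch (the helper name is illustrative; binary case $D = 2$ by default):

```python
from itertools import product

def brute_force_LD(p, D=2, max_len=6):
    # Minimize sum_i p_i * l_i over integer length vectors satisfying
    # the D-ary Kraft inequality sum_i D^{-l_i} <= 1.
    best = float("inf")
    for lengths in product(range(1, max_len + 1), repeat=len(p)):
        if sum(D ** -l for l in lengths) <= 1:
            best = min(best, sum(pi * l for pi, l in zip(p, lengths)))
    return best

# Uniform source on three symbols, binary code: optimal lengths are
# {1, 2, 2}, achieved e.g. by the Huffman code {0, 10, 11}.
print(brute_force_LD([1/3, 1/3, 1/3]))  # ≈ 5/3
```

The search over `max_len` is exponential and only meant to make the definition concrete; Huffman's algorithm computes the same value efficiently.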

2. Maximal Minimal Expected Length and Its Attaining Distributions

The mapping $L_D(\cdot)$ is Schur-concave: it attains its maximum at the uniform distribution $U_n = (1/n, 1/n, \ldots, 1/n)$. Its maximal value is determined as follows. Let $m$ be the unique integer such that $D^m \leq n < D^{m+1}$.

  • If $n = D^m$, then $L_D(U_n) = m$; all codewords have length $m$.
  • If $D^m < n < D^{m+1}$, the optimal code for $U_n$ uses exactly two codeword lengths, $m$ and $m+1$: there are $n - t$ codewords of length $m$ and $t$ codewords of length $m+1$, where $t$ is the smallest count for which the Kraft inequality holds, giving $L_D(U_n) = m + t/n$. For example, $L_2(U_3) = 5/3$, attained by the lengths $\{1, 2, 2\}$.

Consequently,

$$\max_{P \in \mathcal{P}_n} L_D(P) = \begin{cases} m & \text{if } n = D^m, \\ m + t/n & \text{if } D^m < n < D^{m+1}. \end{cases}$$

If $n \neq D^m$, $U_n$ is the unique maximizer; any deviation reduces $L_D(P)$ by strict Schur-concavity. If $n = D^m$, every $P$ (with $p_1 \geq \cdots \geq p_n$) whose $D$ smallest probabilities sum to at least the largest, i.e. $\sum_{i=n-D+1}^{n} p_i \geq p_1$, also attains the bound; these correspond to distributions admitting a full $D$-ary code tree of depth $m$ (Manickam, 2019).

| Case | Maximizing distributions | Maximum value |
| --- | --- | --- |
| $n \neq D^m$ | $U_n$ only | $m + t/n$ |
| $n = D^m$ | all $P$ with $\sum_{i=n-D+1}^n p_i \geq p_1$ | $m$ |
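Both the two-length structure and Schur-concavity can be checked numerically in the binary case. A small sketch, assuming $D = 2$ (so $t = 2(n - 2^m)$; the helper `l2` computes the Huffman expected length as the total weight of all merges):

```python
import heapq
import random

def l2(p):
    # L_2(P) via Huffman: the expected length equals the total
    # weight accumulated over all pairwise merges.
    heap = list(p)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

random.seed(0)
for n in range(2, 33):
    m = n.bit_length() - 1          # unique m with 2^m <= n < 2^(m+1)
    t = 2 * (n - 2 ** m)            # number of length-(m+1) codewords
    u = l2([1 / n] * n)
    assert abs(u - (m + t / n)) < 1e-9       # L_2(U_n) = m + t/n
    # Schur-concavity: any other PMF gives at most the uniform value.
    w = [random.random() + 0.5 for _ in range(n)]
    assert l2([x / sum(w) for x in w]) <= u + 1e-9
```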

3. Minimal Expected Length for Random LCS of Permutations

Let $S_n$ be the set of permutations of $[n]$, and let $\mathcal{L}(\pi, \sigma)$ denote the length of the longest common subsequence (LCS) of $\pi, \sigma \in S_n$. Given a probability distribution $\mu$ on $S_n$, let $\pi$ and $\sigma$ be i.i.d. draws from $\mu$. Define

$$E_{\min}(n) = \min_{\mu}\ \mathbb{E}_{\pi, \sigma \sim \mu}[\mathcal{L}(\pi, \sigma)].$$

This is the minimal possible expected LCS length when the underlying distribution $\mu$ is optimized (Houdré et al., 2017).

Writing $P$ for the probability vector $(\mu(\pi_1), \ldots, \mu(\pi_{n!}))$, the expectation is a quadratic form: $\mathbb{E}_{\pi,\sigma \sim \mu}[\mathcal{L}(\pi,\sigma)] = P^T L^{(n)} P$, where $L^{(n)}$ is the $n! \times n!$ symmetric matrix with entries $\ell_{ij} = \mathcal{L}(\pi_i, \pi_j)$.
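For small $n$ the quadratic-form identity is easy to evaluate directly. A sketch for $n = 3$ (helper names are illustrative):

```python
from itertools import permutations

def lcs_len(a, b):
    # Classic O(|a||b|) dynamic program for the LCS length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

perms = list(permutations(range(3)))       # the 6 elements of S_3
L = [[lcs_len(p, q) for q in perms] for p in perms]

def expected_lcs(P):
    # E_{pi,sigma ~ P}[LCS] = P^T L^{(3)} P
    return sum(P[i] * L[i][j] * P[j]
               for i in range(len(P)) for j in range(len(P)))

uniform = [1 / 6] * 6
print(expected_lcs(uniform))               # ≈ 2.0 for n = 3
```

The value 2 for the uniform distribution on $S_3$ follows by counting: 6 ordered pairs have LCS 3 (the diagonal), 6 reversal pairs have LCS 1, and the remaining 24 pairs have LCS 2.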

4. Uniform vs. Non-Uniform Distributions in LCS Problems

Contrary to the coding context, the uniform distribution $U = (1/n!, \ldots, 1/n!)$ does not always minimize $\mathbb{E}[\mathcal{L}(\pi, \sigma)]$. For $n \geq 4$, $L^{(n)}$ has a strictly negative eigenvalue. Thus, taking a unit-norm eigenvector $R_1^{(n)}$ for that eigenvalue and $c > 0$ sufficiently small, the distribution $P_0 = U + c R_1^{(n)}$ satisfies $\mathbb{E}_{P_0}[\mathcal{L}(\pi,\sigma)] < \mathbb{E}_U[\mathcal{L}(\pi,\sigma)]$. For $n = 2, 3$, the uniform distribution does minimize the expected LCS (Houdré et al., 2017).
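The negative eigenvalue and the resulting improvement over the uniform distribution can be reproduced numerically. The sketch below is pure Python, using power iteration on $sI - L^{(4)}$ restricted to the zero-sum subspace; it relies on the (easy to check) observation that relabeling values by a common permutation preserves LCS, so every row of $L^{(n)}$ has the same sum and the all-ones vector is an eigenvector:

```python
import random
from itertools import permutations

def lcs_len(a, b):
    # Standard dynamic program for the LCS length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

perms = list(permutations(range(4)))
N = len(perms)                                 # 24 permutations in S_4
L = [[lcs_len(p, q) for q in perms] for p in perms]

def mat_vec(v):
    return [sum(L[i][j] * v[j] for j in range(N)) for i in range(N)]

def quad(v):                                   # v^T L v
    return sum(vi * wi for vi, wi in zip(v, mat_vec(v)))

# Power iteration on s*I - L finds the eigenvector of L's smallest
# eigenvalue; a zero-sum start stays orthogonal to the all-ones
# eigenvector because L has constant row sums.
s = max(sum(row) for row in L)
random.seed(1)
v = [random.random() for _ in range(N)]
mean = sum(v) / N
v = [x - mean for x in v]
for _ in range(1000):
    w = mat_vec(v)
    v = [s * vi - wi for vi, wi in zip(v, w)]  # apply sI - L
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]

lam = quad(v)                                  # Rayleigh quotient
assert lam < 0                                 # strictly negative for n >= 4

# Perturbing uniform along v lowers the expected LCS by about c^2 * lam:
U = [1 / N] * N
c = 0.5 / (N * max(abs(x) for x in v))         # keeps P0 nonnegative
P0 = [u + c * x for u, x in zip(U, v)]
assert min(P0) >= 0 and abs(sum(P0) - 1) < 1e-9
assert quad(P0) < quad(U)                      # E_{P0}[LCS] < E_U[LCS]
```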

5. Lower Bounds, Techniques, and Conjectures

By an inequality of Beame–Huynh-Ngoc, for any triple $\pi_1, \pi_2, \pi_3 \in S_n$,

$$\mathcal{L}(\pi_1, \pi_2) \cdot \mathcal{L}(\pi_2, \pi_3) \cdot \mathcal{L}(\pi_3, \pi_1) \geq n,$$

which, via AM–GM averaging, yields the cube-root lower bound $E_{\min}(n) \geq n^{1/3}$. The Bukh–Zhou conjecture posits a universal lower bound of $c\sqrt{n}$ for the minimum expected LCS, but only the cube-root bound has been established to date. The main methodologies are quadratic programming, spectral analysis, eigenvalue interlacing, and combinatorial product inequalities (Houdré et al., 2017). When $\mu$ is uniform, $\mathbb{E}[\mathrm{LIS}(\sigma)] \sim 2\sqrt{n}$ for a single uniform random permutation $\sigma$, which suggests, but does not prove, the $\sqrt{n}$ lower bound for the minimal expected LCS.
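The triple-product inequality is easy to stress-test empirically. A quick sampling sketch (a spot check on random triples, not a proof; the helper name is illustrative):

```python
import random

def lcs_len(a, b):
    # Standard dynamic program for the LCS length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

random.seed(0)
n = 30
for _ in range(200):
    p1, p2, p3 = (random.sample(range(n), n) for _ in range(3))
    a, b, c = lcs_len(p1, p2), lcs_len(p2, p3), lcs_len(p3, p1)
    assert a * b * c >= n                 # Beame–Huynh-Ngoc triple product
    assert max(a, b, c) >= n ** (1 / 3)   # so some pair has LCS >= n^(1/3)
```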

6. Significance and Implications

The extremal characterization of the minimum expected codeword length guides source coding design, particularly in optimal code assignment problems and information theory. The precise conditions under which the uniform or specific non-uniform PMFs maximize this expectation clarify the interplay between source entropy, code structure, and expected description length. In the LCS context, the finding that suitably chosen non-uniform distributions yield a smaller expected LCS than the uniform distribution, in contrast with classical random-coding intuition, has implications for combinatorial optimization and complexity theory. The techniques deployed, particularly spectral analysis and quadratic programming, provide templates for analyzing related symmetric functionals over probability simplices, while combinatorial inequalities such as the Beame–Huynh-Ngoc lemma demonstrate the structural richness of LCS problems. These results establish sharp boundaries between achievable extremal values and guide future inquiries toward tighter probabilistic and combinatorial bounds (Manickam, 2019; Houdré et al., 2017).
