
Preference-Perceptron: Interactive Preference Learning

Updated 13 February 2026
  • Preference-Perceptron is a class of algorithms that extend perceptron learning to interactive preference elicitation and collaborative filtering, integrating user choice feedback.
  • It employs both neural and linear models with online updates and MILP-based query selection to manage high-dimensional and combinatorial data.
  • The method offers scalability, fast convergence, and robust theoretical guarantees, proving effective in applications such as trip planning and PC configuration.

The Preference-Perceptron is a class of algorithms that extends perceptron-style online learning to problems of interactive preference elicitation or collaborative filtering, where user preferences inform the weight updates. These algorithms bridge the gap between classical logistic regression-based recommendation and more expressive neural or linear models capable of accommodating noisy, partial-preference feedback, especially in high-dimensional or combinatorial domains (Chakraborty, 2024, Dragone et al., 2017).

1. Model Definitions and Problem Settings

Preference-Perceptron algorithms are designed to infer personalized utility functions for users based on their observed preferences across items, configurations, or combinatorial candidates.

  • In classical collaborative filtering, the system seeks to learn a function $f(x; W, b)$ (possibly via an MLP, i.e., multilayer perceptron) that predicts an individual user's preference $y$ (binary or real-valued) given a feature representation $x \in \mathbb{R}^n$ of an item or configuration (Chakraborty, 2024).
  • In combinatorial preference elicitation, each round $t$ consists of a context $x^t$, a set of queried candidates $\mathcal{Q}^t = \{y_1, \ldots, y_k\}$, and partial feedback in the form of a single chosen item $\bar{y}^t$ (Dragone et al., 2017).

For each domain:

  • The true (latent) user utility is assumed linear: $u^*(x, y) = \langle \mathbf{w}^*, \phi(x, y) \rangle$ with a hybrid feature map $\phi$.
  • Feature vectors may encode categorical, Boolean, and real-valued attributes, often with one-hot and numerical encoding, and context-specific features.
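A hybrid feature map of this kind can be illustrated in a few lines; the attribute schema below (one categorical, one Boolean, one real-valued attribute) is purely hypothetical:

```python
import numpy as np

# Hypothetical attribute schema for illustration only.
CATEGORIES = ["laptop", "desktop", "server"]

def phi(category: str, in_stock: bool, price: float) -> np.ndarray:
    """Hybrid feature map: one-hot categorical + Boolean + scaled numeric."""
    one_hot = np.array([1.0 if category == c else 0.0 for c in CATEGORIES])
    return np.concatenate([one_hot, [float(in_stock)], [price / 1000.0]])

x = phi("desktop", True, 899.0)
# 5-dimensional hybrid feature vector: [0, 1, 0, 1, 0.899]
```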

2. Model Architectures

Feed-forward Neural Preference-Perceptron

This instantiation generalizes logistic regression with an MLP for user-item pairs:

  • Input: Feature vector $x^{(i)}$ for item $i$ (optionally concatenated with user features).
  • Network: For $L$ layers, activations are recursively defined:

$$a^{(1)} = x, \qquad z^{(\ell)} = W^{(\ell-1)} a^{(\ell-1)} + b^{(\ell-1)}, \qquad a^{(\ell)} = g(z^{(\ell)}), \quad \ell = 2, \ldots, L$$

where $g$ is a nonlinearity (e.g., ReLU or sigmoid).

  • Output: Scalar preference score passed through a sigmoid (binary) or linear (regression) head:

$$f(x; W, b) = a^{(L)}$$

(Chakraborty, 2024)
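The forward recursion above can be sketched in a few lines of NumPy; the layer sizes and random weights here are arbitrary illustrations:

```python
import numpy as np

def forward(x, weights, biases, sigmoid_head=True):
    """Forward pass of an L-layer preference MLP: a^(1) = x, then
    z^(l) = W a^(l-1) + b with ReLU hidden layers and a sigmoid
    (binary) or linear (regression) output head."""
    a = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b
        if l < len(weights) - 1:
            a = np.maximum(z, 0.0)            # ReLU hidden nonlinearity
        elif sigmoid_head:
            a = 1.0 / (1.0 + np.exp(-z))      # sigmoid head for binary preference
        else:
            a = z                             # linear head for regression
    return a

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(4, 3)), rng.normal(size=(1, 4))]
bs = [np.zeros(4), np.zeros(1)]
score = forward(np.array([0.2, -0.1, 0.5]), Ws, bs)
# score is a length-1 array with a value in (0, 1)
```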

Online Linear Preference-Perceptron (Choice Perceptron)

For set-wise queries over combinatorial spaces:

  • At round $t$ with current weights $\mathbf{w}^t$, after observing the user choice $\bar{y}^t$ from $\mathcal{Q}^t$, update:

$$\mathbf{w}^{t+1} = \mathbf{w}^t + \eta \Delta^t$$

where

$$\Delta^t = \phi(x^t, \bar{y}^t) - \frac{1}{k-1} \sum_{y \in \mathcal{Q}^t,\, y \neq \bar{y}^t} \phi(x^t, y)$$

Intuitively, this update increases the estimated utility of the chosen configuration while decreasing that of the average rejected candidate (Dragone et al., 2017).

  • Estimated utility at each iteration: $u^t(x, y) = \langle \mathbf{w}^t, \phi(x, y) \rangle$.
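The update rule can be sketched directly; the feature vectors and learning rate below are arbitrary illustrations:

```python
import numpy as np

def choice_perceptron_update(w, phi_chosen, phi_rejected, eta=0.1):
    """Perceptron-style update from a set-wise choice:
    w <- w + eta * (phi(chosen) - mean of phi over the k-1 rejected)."""
    delta = phi_chosen - np.mean(phi_rejected, axis=0)
    return w + eta * delta

w = np.zeros(3)
chosen = np.array([1.0, 0.0, 1.0])
rejected = [np.array([0.0, 1.0, 1.0]), np.array([0.0, 0.0, 0.0])]
w = choice_perceptron_update(w, chosen, rejected)
# w is now [0.1, -0.05, 0.05]; the chosen item's utility w @ chosen is positive
```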

3. Mathematical Formulation and Optimization

Preference-Perceptron algorithms employ several key optimization and learning rules:

  • Feature Normalization and Scaling

$$\mu_k = \frac{1}{m}\sum_{i=1}^m x^{(i)}_k, \qquad \text{range}_k = \max_i x^{(i)}_k - \min_i x^{(i)}_k$$

$$\tilde{x}^{(i)}_k = \frac{x^{(i)}_k - \mu_k}{\max(\text{range}_k, \varepsilon)}$$
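A minimal sketch of this normalization, assuming a rows-as-examples data matrix:

```python
import numpy as np

def range_scale(X, eps=1e-8):
    """Per-feature mean-centering and range scaling, with eps guarding
    against division by zero for constant features."""
    mu = X.mean(axis=0)
    feature_range = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / np.maximum(feature_range, eps)

X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])
Xs = range_scale(X)
# each column now spans [-0.5, 0.5]
```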

  • Loss Functions

    • Cross-entropy for binary preferences:

    $$L(W, b; x, y) = -y \log f(x) - (1-y) \log(1-f(x)) + \frac{\lambda}{2}\sum_{\ell=1}^{L-1} \|W^{(\ell)}\|_F^2$$

    • Squared error for real-valued preferences:

    $$L(W, b; x, y) = \frac{1}{2}(f(x)-y)^2 + \frac{\lambda}{2}\sum_{\ell=1}^{L-1} \|W^{(\ell)}\|_F^2$$

(Chakraborty, 2024)

  • Gradient-Based Updates

$$W^{(\ell)} \leftarrow W^{(\ell)} - \eta \nabla_{W^{(\ell)}} L(W, b; x, y)$$

with batch or mini-batch variants.
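A minimal sketch of the regularized cross-entropy loss and a single gradient step; the example values are arbitrary, and the gradient itself would come from backpropagation, which is omitted here:

```python
import numpy as np

def binary_ce_loss(f_x, y, weights, lam):
    """Cross-entropy on one example plus an L2 penalty on all weight matrices."""
    ce = -y * np.log(f_x) - (1 - y) * np.log(1 - f_x)
    reg = 0.5 * lam * sum(np.sum(W ** 2) for W in weights)
    return ce + reg

def sgd_step(W, grad_W, eta):
    """One (mini-batch) gradient step on a weight matrix."""
    return W - eta * grad_W

loss = binary_ce_loss(0.9, 1, [np.ones((2, 2))], lam=0.1)
# -log(0.9) + 0.5 * 0.1 * 4 ≈ 0.3054
```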

  • Query-Selection via MILP (Combinatorial Setting)

    • At each round, maximize a weighted sum of feature diversity and estimated utility within $\mathcal{Q}^t$:

    $$\max_{\{y_1,\ldots,y_k\}} \; \gamma \sum_{i=2}^k \|\phi(x^t, y_1)-\phi(x^t, y_i)\|_1 + (1-\gamma)\sum_{i=2}^k u^t(x^t, y_i)$$

    subject to distinctness and optimality constraints (Dragone et al., 2017).
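For a small finite candidate pool, the objective can be illustrated with a brute-force search standing in for the MILP solver; the pool, features, and utilities below are hypothetical, and a real deployment would use an MILP formulation as in the source:

```python
import itertools
import numpy as np

def select_query(candidates, feats, utils, k=2, gamma=0.5):
    """Brute-force stand-in for the MILP: score each ordered size-k subset
    by gamma * L1 diversity of y_2..y_k from y_1 plus
    (1 - gamma) * estimated utility of y_2..y_k; return the best subset."""
    best, best_val = None, -np.inf
    for subset in itertools.permutations(range(len(candidates)), k):
        y1, rest = subset[0], subset[1:]
        val = gamma * sum(np.abs(feats[y1] - feats[i]).sum() for i in rest) \
            + (1 - gamma) * sum(utils[i] for i in rest)
        if val > best_val:
            best, best_val = subset, val
    return [candidates[i] for i in best]

feats = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([1.0, 0.0])]
query = select_query(["a", "b", "c"], feats, utils=[0.3, 1.0, 0.5], k=2)
# picks the pair that balances diversity and utility: ["a", "b"]
```

Exhaustive enumeration is exponential in $k$ and only viable for tiny pools; the MILP encoding is what makes this step practical over combinatorial spaces.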

4. Learning Dynamics and Theoretical Guarantees

Preference-Perceptron algorithms admit rigorous regret analysis under reasonable user models and query strategies in set-wise feedback settings.

  • User Model: The choice probabilities are non-decreasing in true utility.
  • Query Informativeness ($\alpha$), Affirmativeness ($\beta$): Definitions guarantee that, in expectation, learning steps are sufficiently informative and the update direction is controlled.
  • Regret Bound:

$$\mathbb{E}[\mathrm{reg}^T] \leq \frac{\sqrt{2\beta/\eta + 4R^2}\,\|\mathbf{w}^*\|}{\alpha\sqrt{T}} + O(1/T)$$

This yields $O(1/\sqrt{T})$ convergence in average regret for combinatorial constructive preference elicitation (Dragone et al., 2017).

5. Training and Practical Deployment

  • Stochastic Optimization: Both single-sample and mini-batch gradient updates are used; mini-batch sizes of $10$–$100$ are effective (Chakraborty, 2024).
  • Hyperparameters: Initial learning rate $\eta$, regularization strength $\lambda$, mini-batch size, network depth (for the MLP), activation functions, and early stopping criteria form the critical hyperparameter set.
  • Convergence Monitoring: Early stopping via a validation-loss plateau, iteration limits, or small weight updates is commonly used.
  • Implementation Considerations: Query selection in constructive settings requires MILP solvers for practical, efficient computation of diverse and high-utility candidate sets (Dragone et al., 2017).
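The convergence-monitoring criteria above can be combined into a simple training driver; this is a generic sketch in which `update_fn` and `val_loss_fn` are hypothetical callbacks:

```python
def train_with_early_stopping(update_fn, val_loss_fn,
                              max_iters=1000, patience=10, tol=1e-5):
    """Run updates until the validation loss plateaus (no improvement
    beyond tol for `patience` rounds) or the iteration limit is hit."""
    best, stale = float("inf"), 0
    for _ in range(max_iters):
        update_fn()
        loss = val_loss_fn()
        if loss < best - tol:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best

# Simulated validation-loss trace: improves, then plateaus.
trace = iter([1.0, 0.5, 0.4] + [0.4] * 100)
best = train_with_early_stopping(lambda: None, lambda: next(trace), patience=2)
# stops shortly after the plateau; best == 0.4
```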

6. Applications, Empirical Findings, and Scalability

Preference-Perceptron strategies have been empirically validated in several domains:

  • Synthetic Testbeds: On high-dimensional Boolean and hybrid feature domains, Preference-Perceptron outperforms or matches the regret of Bayesian EVPI and max-margin (SetMargin) approaches while being one to two orders of magnitude faster for set queries ($k = 2$–$4$).
  • PC Configuration: Handles seven categorical and one numerical attribute with compatibility constraints, achieving lower regret and ≈5× faster evaluation than SetMargin.
  • Trip Planning: With up to 127 features, the method remains scalable where alternatives are infeasible due to feature blow-up or MILP timeouts for very large $k$ (Dragone et al., 2017).

In collaborative filtering, the MLP Preference-Perceptron augments classical logistic regression by learning more complex, nonlinear boundaries, leveraging backpropagation and advanced optimization (feature scaling, normalization, regularization, learning rate decay, gradient checking) to fit user preference data (Chakraborty, 2024).

  • Classical Perceptron vs. Preference-Perceptron: The latter is specialized for preference-based, partial-information feedback (binary choices, real-valued ratings, setwise selection) and adapts through either backpropagation in neural nets or linear updates from setwise choice.
  • Interpretation: The Preference-Perceptron label applies because the perceptron or MLP receives direct preference (binary or scalar) signals as targets, fitting weights accordingly by perceptron-style or backpropagation learning rules (Chakraborty, 2024, Dragone et al., 2017).
  • Collaborative Filtering and Constructive Elicitation: Choice Perceptron uniquely supports constructive (synthesized) object spaces and hybrid features, and it provides formal learning guarantees and state-of-the-art empirical performance.

The Preference-Perceptron thus subsumes a family of algorithms instrumental both for collaborative filtering in recommender systems and interactive preference elicitation over hybrid combinatorial spaces, furnishing scalable, theoretically grounded, and empirically validated frameworks for preference modeling in complex domains.

