Abstract
We consider price competition among multiple sellers over a selling horizon
of $T$ periods. In each period, sellers simultaneously offer their prices, and
each seller then observes its own demand, which is not observable to
competitors. The demand function for each seller depends on all sellers' prices
through a private, unknown, and nonlinear relationship. To address this
challenge, we propose a pricing policy based on semi-parametric least-squares
estimation of the nonlinear mean function; the policy does not require sellers
to communicate demand information. We show that when all sellers employ our
policy, their prices
converge at a rate of $O(T^{-1/7})$ to the Nash equilibrium prices that sellers
would reach if they were fully informed. Each seller incurs a regret of
$O(T^{5/7})$ relative to a dynamic benchmark policy. A theoretical contribution
of our work is to prove the existence of an equilibrium under shape-constrained
demand functions via the concept of $s$-concavity and to establish regret
bounds for our proposed policy. On the technical side, we also establish new
concentration results for the least-squares estimator under shape constraints.
Our findings
offer insights into dynamic competition-aware pricing and
contribute to the broader study of non-parametric learning in strategic
decision-making.