Abstract
Personalized pricing analytics is becoming an essential tool in retailing.
Upon observing the personalized information of each arriving customer, the firm
sets a price based on covariates such as income, education background, and past
purchasing history in order to extract more revenue. For new
entrants to a market, the lack of historical data may severely limit the
power and profitability of personalized pricing. We propose a nonparametric
pricing policy to simultaneously learn the preference of customers based on the
covariates and maximize the expected revenue over a finite horizon. The policy
does not depend on any prior assumptions on how the personalized information
affects consumers' preferences (such as linear models). It adaptively splits
the covariate space into smaller bins (hyper-rectangles) and clusters customers
based on their covariates and preferences, offering similar prices to
customers in the same cluster and thereby trading off granularity against accuracy.
We show that the algorithm achieves a regret of order $O(\log(T)^2
T^{(2+d)/(4+d)})$, where $T$ is the length of the horizon and $d$ is the
dimension of the covariate. It improves on the current regret bound in the literature
\citep{slivkins2014contextual}, under mild technical conditions in the pricing
context (smoothness and local concavity). We also prove that no policy can
achieve a regret less than $O(T^{(2+d)/(4+d)})$ for a particular instance and
thus demonstrate the near optimality of the proposed policy.
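
To give a feel for the stated rates, the following minimal sketch evaluates the exponent $(2+d)/(4+d)$ and the two growth rates for a few values of $d$. The function names are illustrative only, the constants hidden by $O(\cdot)$ are set to 1, and the natural logarithm is assumed; none of these choices come from the paper itself.

```python
import math


def regret_exponent(d: int) -> float:
    """Exponent (2 + d) / (4 + d) of T in the stated regret rate."""
    if d < 1:
        raise ValueError("covariate dimension d must be a positive integer")
    return (2 + d) / (4 + d)


def upper_rate(T: int, d: int) -> float:
    """log(T)^2 * T^((2+d)/(4+d)); hidden constant assumed to be 1."""
    return math.log(T) ** 2 * T ** regret_exponent(d)


def lower_rate(T: int, d: int) -> float:
    """T^((2+d)/(4+d)); hidden constant assumed to be 1."""
    return T ** regret_exponent(d)


if __name__ == "__main__":
    T = 10_000  # hypothetical horizon length
    for d in (1, 2, 5, 10):
        gap = upper_rate(T, d) / lower_rate(T, d)  # equals log(T)^2
        print(f"d={d:2d}  exponent={regret_exponent(d):.3f}  gap={gap:.1f}")
```

The exponent is $3/5$ for $d=1$, $2/3$ for $d=2$, and approaches 1 as $d$ grows, so the guarantee weakens as more covariates are used. The ratio between the two rates is exactly $\log(T)^2$, which is the sense in which the policy is near optimal.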