M. Buddhikot, Understanding Dynamic Spectrum Allocation: Models, Taxonomy and Challenges, Proc. IEEE DySPAN'07, pp.649-663, 2007.
DOI : 10.1109/dyspan.2007.88

S. Buljore, IEEE P1900.4 Standard: Reconfiguration of multi-radio systems, 2008 IEEE Region 8 International Conference on Computational Technologies in Electrical and Electronics Engineering, pp.413-417, 2008.
DOI : 10.1109/SIBIRCON.2008.4602601

S. Filin, Dynamic Spectrum Assignment and Access Scenarios, System Architecture, Functional Architecture and Procedures for IEEE P1900, Proc. IEEE CrownCom'08, pp.1-7, 2008.

M. Buddhikot, P. Kolodzy, K. Ryan, J. Evans, and S. Miller, DIMSUMNet: New Directions in Wireless Networking Using Coordinated Dynamic Spectrum Access, Sixth IEEE International Symposium on a World of Wireless Mobile and Multimedia Networks, pp.78-85, 2005.
DOI : 10.1109/WOWMOM.2005.36

D. Thilakawardana, K. Moessner, and R. Tafazolli, Darwinian approach for dynamic spectrum allocation in next generation systems, IET Communications, vol.2, issue.6, pp.827-836, 2008.
DOI : 10.1049/iet-com:20070502

S. Sankaranarayanan, P. Papadimitratos, A. Mishra, and S. Hershey, A bandwidth sharing approach to improve licensed spectrum utilization, First IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks (DySPAN 2005), pp.279-288, 2005.
DOI : 10.1109/DYSPAN.2005.1542644

J. M. Chapin and W. H. Lehr, Cognitive Radios for Dynamic Spectrum Access - The Path to Market Success for Dynamic Spectrum Access Technology, IEEE Communications Magazine, vol.45, issue.5, pp.96-103, 2007.
DOI : 10.1109/MCOM.2007.358855

J. Acharya and R. D. Yates, A price based dynamic spectrum allocation scheme, 2007 Conference Record of the Forty-First Asilomar Conference on Signals, Systems and Computers, pp.797-801, 2007.
DOI : 10.1109/ACSSC.2007.4487326

L. Vanbien, L. Yuewei, W. Xiaomeng, F. Zhiyong, and Z. Ping, A Cell Based Dynamic Spectrum Management Scheme with Interference Mitigation for Cognitive Networks, Proc. VTC'08, pp.1594-1598, 2008.

S. Geirhofer, L. Tong, and B. M. Sadler, Cognitive Medium Access: A Protocol for Enhancing Coexistence in WLAN Bands, IEEE GLOBECOM 2007 - IEEE Global Telecommunications Conference, pp.3558-3562, 2007.
DOI : 10.1109/GLOCOM.2007.676

Q. Zhao, L. Tong, A. Swami, and Y. Chen, Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework, IEEE Journal on Selected Areas in Communications, vol.25, issue.3, pp.589-600, 2007.
DOI : 10.1109/JSAC.2007.070409

M. Coupechoux, J. M. Kelif, and P. Godlewski, SMDP approach for JRRM analysis in heterogeneous networks, 2008 14th European Wireless Conference, pp.1-7, 2008.
DOI : 10.1109/EW.2008.4623856

URL : https://hal.archives-ouvertes.fr/hal-01493342

N. Enderlé and X. Lagrange, User satisfaction models and scheduling algorithms for packet-switched services in UMTS, The 57th IEEE Semiannual Vehicular Technology Conference (VTC 2003-Spring), pp.1704-1709, 2003.
DOI : 10.1109/VETECS.2003.1207114

D. P. Bertsekas, Dynamic Programming and Optimal Control, 2007.

P. Tadepalli and D. Ok, Model-based average reward reinforcement learning, Artificial Intelligence, vol.100, issue.1-2, pp.177-224, 1998.
DOI : 10.1016/S0004-3702(98)00002-2

P. Y. Glorennec, Reinforcement Learning: an Overview, European Symposium on Intelligent Techniques, pp.17-35, 2000.

J. Abounadi and D. Bertsekas, Learning Algorithms for Markov Decision Processes with Average Cost, SIAM Journal on Control and Optimization, vol.40, issue.3, pp.681-698, 2001.
DOI : 10.1137/S0363012999361974

C. J. Watkins, Learning from Delayed Rewards, 1989.

A. Gosavi, Reinforcement learning for long-run average cost, European Journal of Operational Research, vol.155, issue.3, pp.654-674, 2004.