In this paper, we consider a multi-user multi-access edge computing (MEC) wireless network in which multiple mobile devices can associate with base stations and offload computation tasks over wireless channels to the MEC servers attached to those base stations. The decision of whether a computation task is executed locally on the user device or offloaded to an MEC server should adapt to the time-varying network dynamics. Accounting for these dynamics, we propose a deep reinforcement learning (DRL) based approach to solve the formulated nonconvex problem of minimizing the computation cost in terms of total delay. However, real-world networks tend to have many users and MEC servers, and hence a very large number of possible actions, so evaluating every combination of actions becomes impractical. Conventional DRL methods are therefore difficult or even impossible to apply directly to the proposed model. Based on a recursive decomposition of the action space available in each state, we propose a DRL-based algorithm for joint server selection, cooperative offloading, and handover in a multi-access edge wireless network. Numerical results show that the proposed DRL-based algorithm significantly outperforms both the traditional Q-learning method and local computation in terms of task success rate and total delay.
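To illustrate why decomposing the action space matters, the following is a minimal sketch (not the paper's algorithm) of the scaling argument: a flat value-based learner must score every joint (server selection, offloading level, handover) combination, whereas a factored learner keeps one value head per decision dimension and picks each sub-action greedily, so the number of evaluations grows additively rather than multiplicatively. All dimension sizes here are hypothetical placeholders, not values from the paper.

```python
import numpy as np

# Hypothetical per-user decision dimensions (illustrative only):
# 4 candidate MEC servers, 5 offloading levels, 2 handover choices.
DIMS = [4, 5, 2]

# A flat learner scores every joint action: 4 * 5 * 2 = 40 evaluations.
joint_evals = int(np.prod(DIMS))

# A factored learner scores each dimension separately: 4 + 5 + 2 = 11.
factored_evals = int(np.sum(DIMS))

def select_action(q_heads):
    """Greedy sub-action per dimension; q_heads[i] has shape (DIMS[i],)."""
    return tuple(int(np.argmax(q)) for q in q_heads)

# Stand-in for per-dimension value estimates produced by a learned network.
rng = np.random.default_rng(0)
q_heads = [rng.standard_normal(d) for d in DIMS]
action = select_action(q_heads)  # (server index, offload level, handover flag)
```

The gap between the two counts widens exponentially with the number of users and servers, which is the motivation for the recursive decomposition described above.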