Generative design is a technology that generates a large number of structurally optimal designs in parallel by diversifying the problem definition of topology optimization. Recently, AI-based design automation that combines generative design with deep learning has been gaining much attention. When a large number of designs are generated, one of the important evaluation factors is diversity. In general, the problem definition is diversified by varying the force and boundary conditions, and the diversity of the generated designs is influenced by the levels of these parameters. This study proposes a reinforcement learning (RL) based generative design process that determines optimal parameter level values for a given initial design. Actor-Critic and Proximal Policy Optimization (PPO) are applied as the learning framework, and the results show that RL can increase the diversity of the generated designs.