Hybrid Learning for Cold-Start-Aware Microservice Scheduling in Dynamic Edge Environments

This repository contains a simplified implementation of the algorithm proposed in our paper:

"Hybrid Learning for Cold-Start-Aware Microservice Scheduling in Dynamic Edge Environments"

The method is based on GRU-enhanced Soft Actor-Critic (SAC) and features a two-stage training pipeline:

1. Offline imitation learning using expert trajectory data.
2. Online reinforcement learning using the SAC algorithm.

📦 Dependencies

Install required packages via pip or conda:

pip install torch numpy

Standard-library modules also used (no installation needed):

random
csv
pickle
argparse
datetime

⚙️ Configuration

Environment and training parameters are set in:

config.py

Key configurable items:

EDGE_NODE_NUM: Number of edge nodes
max_tasks, min_tasks: Maximum and minimum number of microservice tasks per time slot
node_cpu_freq_max: Maximum CPU frequency of nodes
epoch_imitation: Number of imitation learning epochs
epoch: Number of reinforcement learning episodes
e1, e2: Weight coefficients for energy and delay in expert policy
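
As a rough illustration of how these settings might look, here is a minimal sketch of a config.py. The parameter names come from the list above; every value is a placeholder (some loosely borrowed from the sample filenames in this README), and the linear form of expert_cost, including which weight pairs with which term, is an assumption rather than the repository's actual expert policy.

```python
# Hypothetical sketch of config.py: names from the README, values illustrative only.
EDGE_NODE_NUM = 15          # number of edge nodes
max_tasks = 20              # upper bound on microservice tasks per time slot
min_tasks = 5               # lower bound on microservice tasks per time slot
node_cpu_freq_max = 650     # maximum CPU frequency of a node (unit not stated in the README)
epoch_imitation = 10        # imitation learning epochs
epoch = 10                  # reinforcement learning episodes
e1 = 0.5                    # assumed: weight on energy in the expert policy
e2 = 0.5                    # assumed: weight on delay in the expert policy


def expert_cost(energy, delay, w_energy=e1, w_delay=e2):
    """Assumed linear trade-off an expert policy could minimise."""
    return w_energy * energy + w_delay * delay
```

With equal weights, expert_cost(2.0, 4.0) evaluates to 3.0; raising e1 relative to e2 would make such an expert favour energy savings over low delay.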

🚀 Usage

1. Generate Expert Data (Optional)

If you haven’t generated expert trajectories yet, run:

python offline_data_collection.py

This will create an offline data file like:

offline_data_seq_n15_cpu650_task20-5_ep10.pkl
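
The filename appears to encode the configuration used to collect the data. The sketch below rebuilds that name and loads the file; the mapping of each field (node count, CPU frequency, max-min tasks, episode count) is inferred from the example above and is an assumption, as are the helper names offline_data_filename and load_offline_data. The structure of the unpickled object is not documented here, so it is returned as-is.

```python
import pickle
from pathlib import Path


def offline_data_filename(n_nodes, cpu_freq_max, max_tasks, min_tasks, episodes):
    """Build a name following the pattern of the example file (assumed field order)."""
    return (
        f"offline_data_seq_n{n_nodes}_cpu{cpu_freq_max}"
        f"_task{max_tasks}-{min_tasks}_ep{episodes}.pkl"
    )


def load_offline_data(path):
    """Unpickle the expert trajectories; only load pickle files you trust."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

For example, offline_data_filename(15, 650, 20, 5, 10) reproduces the filename shown above.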

2. Train the GRU-SAC Model

Run the full training pipeline (imitation learning followed by SAC):

python gru_sac_behavior_clone.py

The training consists of two phases:

📍 Phase 1: Imitation Learning (Behavior Cloning)

Trains the GRU-based actor to mimic the expert policy
Uses offline trajectory data saved in .pkl files
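
To make the behavior cloning objective concrete, here is a dependency-free sketch of the loss it typically minimises: the mean negative log-likelihood of the expert's action under the actor's softmax output. It assumes a discrete action space (for instance, picking one edge node per task), which this README does not state; the actual script presumably computes an equivalent quantity with PyTorch on the GRU actor's outputs.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of action scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def behavior_cloning_loss(batch_logits, expert_actions):
    """Mean negative log-likelihood of the expert actions (cross-entropy)."""
    total = 0.0
    for logits, action in zip(batch_logits, expert_actions):
        total -= log_softmax(logits)[action]
    return total / len(expert_actions)
```

The loss shrinks as the actor assigns more probability to the action the expert chose, which is what "mimicking" the expert means in this phase.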

📍 Phase 2: Online Reinforcement Learning (SAC)

Trains actor and critic using online interaction with the environment
Learns to balance scheduling delay and energy consumption
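
For readers new to SAC, the sketch below shows the standard entropy-regularised target that the critics regress toward, written for a single transition with plain floats. It illustrates the general SAC update, not this repository's code; the function name and the default values of gamma and alpha are assumptions.

```python
def soft_td_target(reward, done, q1_next, q2_next, log_prob_next,
                   gamma=0.99, alpha=1.0):
    """Standard SAC critic target for one transition.

    Uses the smaller of the two target-critic estimates and adds an
    entropy bonus (-alpha * log pi) for the next action.
    """
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    not_done = 0.0 if done else 1.0  # no bootstrapping past a terminal step
    return reward + gamma * not_done * soft_value
```

A larger alpha rewards more random (higher-entropy) scheduling decisions, which encourages exploration during online training.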

📁 Output

The actor trained by imitation learning is saved as:

imitation_gru_sac_ep{epoch_imitation}_{timestamp}.pth

Training logs (CSV format) are saved under log/, e.g.:

log/gruBC_SAC_ep10_alpha1_n15_cpu650_task20-5_20250610.csv

Each line records:

Episode, Reward, Total Time, Total Energy, Completion Ratio, Download Time, Actor Loss, Critic Loss, Fail Times
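
A log with these columns can be read back using only the standard library. The sketch below assumes every field is numeric and that the file may or may not start with a header row (it skips one if present); read_training_log and LOG_COLUMNS are hypothetical names, not part of the repository.

```python
import csv

# Column order as listed in this README.
LOG_COLUMNS = [
    "Episode", "Reward", "Total Time", "Total Energy", "Completion Ratio",
    "Download Time", "Actor Loss", "Critic Loss", "Fail Times",
]


def read_training_log(lines):
    """Parse CSV lines (e.g. an open file) into a list of dicts of floats."""
    rows = []
    for record in csv.reader(lines):
        # Skip blank lines and a header row, if the file has one.
        if not record or record[0].strip() == LOG_COLUMNS[0]:
            continue
        rows.append({name: float(value) for name, value in zip(LOG_COLUMNS, record)})
    return rows
```

Typical use would be to open the log with open(path, newline="") and pass the file object to read_training_log, then plot or average the "Reward" column across episodes.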
