aws-samples/sample-code-for-macroservice-extraction

Macro Service Extraction from a Modernized Mainframe Application

This repository provides a Python-based tool and sample code for extracting use-case-based macro services from monolithic Java applications produced by AWS Transform Refactor. Given a list of programs for a use case, the tool automatically resolves dependencies and generates a standalone, deployable Java project containing only the required programs.

Input: A large Java codebase (an AWS Transform refactored application) and a CSV list of the programs to extract

Output: A smaller, focused Maven project containing only the programs needed for a specific use case, with minimal dependencies

Key Features

  • Dependency Analysis: Analyzes Java import statements and program dependencies
  • Program Identification: Extracts program identifiers from the Java codebase
  • Multiple Split Strategies: Supports symbolic link copies (split_copy) and full copies (full_split_copy)
  • DAO Management: Handles Data Access Object dependencies automatically
  • Web Component Support: Optional web module splitting
  • Groovy Support: Analyzes Groovy script dependencies
  • Configuration-Driven: YAML-based configuration system

Repository Structure

├── macroservice-extractor/    # Python extraction tool (source code, configs, workspace)
├── modernized-card-demo/      # Sample modernized Java codebase (CardDemo application)
├── card-demo-macroservice/    # Example extraction output

Prerequisites

  • Python 3.10 or higher
  • Java 21 (Amazon Corretto recommended)
  • Apache Maven 3.6+
  • AWS Transform refactored application (Gapwalk framework)
  • Dependency data from the AWS Transform analyze job (see Data Preparation)

Quick Start

  1. Install Python dependencies:

    pip install -r macroservice-extractor/Java_Code_Spliter/requirements.txt
  2. Configure the project settings:

    • Edit macroservice-extractor/ressources/project_config.yml with your project paths
    • Edit macroservice-extractor/command.yml with extraction parameters
  3. Create a programs CSV listing the programs for your use case (one per line) in macroservice-extractor/Workspace/CSV/

  4. Run the extraction:

    python macroservice-extractor/Java_Code_Spliter/src/MainCommandLine.py -s macroservice-extractor/command.yml
  5. Verify the output in the extraction path specified in command.yml

Configuration

The tool requires two separate YAML configuration files to operate:

1. Project Configuration (macroservice-extractor/ressources/project_config.yml)

This file contains project-specific settings that describe your Java codebase structure and dependencies.

workspace_path: /path/to/macroservice-extractor/Workspace
dependencies_bluage_path: /path/to/dependencies.csv
project_name: app
project_path: /path/to/modernized-java-project
project_library_name: com/company/app
groovy_path: /path/to/groovy/scripts
sub_project_list:
  - entities
  - service
runtime_included:
  - service:
    - program/utils/**.*
    - servlet/**.*

Parameters Explained

  • workspace_path: Directory where the tool stores cached analysis files (pickle files)

  • dependencies_bluage_path: Path to the CSV file containing dependency information. The dependency data comes from the output artifacts of the AWS Transform analyze job (from the AWS Transform web application). Use jsonConverter.py to convert the JSON output to the CSV format expected by this tool.

  • project_name: Name of your Java project (typically the root module name, e.g., app for app-pom, app-service, app-entities)

  • project_path: Full path to your Java project codebase directory (the folder that contains [project_name]-pom/)

  • project_library_name: Java package structure using forward slashes

    • Example: com/company/app for package com.company.app
    • To verify: check any Java file's package declaration
  • groovy_path: Path to Groovy script files (if your project uses Groovy). Leave empty or omit if not using Groovy.

  • sub_project_list: List of Maven submodules the tool should scan for dependencies and extract code from

    • Common examples: entities, service
  • runtime_included: Runtime dependencies for each subproject that must always be included regardless of import analysis

    • Specifies which files/packages should be included at runtime for each module
    • Uses glob patterns (e.g., **.* for all files in subdirectories)
  • dao_ids_included (optional): List of Data Access Object identifiers to always include
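
As a quick check for the project_library_name value described above, a package declaration can be converted to the forward-slash form. This is a minimal sketch: the helper name package_to_library_name and the depth parameter are hypothetical and not part of the extractor, and it assumes the library name is the first few segments of the package.

```python
import re

# Hypothetical helper, not part of the extractor: derives a candidate
# project_library_name from the package declaration of a Java file.
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)


def package_to_library_name(java_source: str, depth: int = 3) -> str:
    """Return the first `depth` package segments joined with '/'."""
    match = PACKAGE_RE.search(java_source)
    if match is None:
        raise ValueError("no package declaration found")
    segments = match.group(1).split(".")
    return "/".join(segments[:depth])


# Invented sample source, used only for illustration.
sample = "package com.company.app.service.program;\n\npublic class Foo {}\n"
print(package_to_library_name(sample))  # com/company/app
```

The depth of 3 matches the com/company/app example only; adjust it to however many segments your root package has.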

2. Command Configuration (macroservice-extractor/command.yml)

This file contains the execution parameters that control how the splitter runs.

action_type: split_copy
project_config_path: /path/to/project_config.yml
extraction_path: /path/to/output/
split_csv_path: /path/to/programs.csv
to_split_web: false
to_delete: true
to_load_dependencies: true

Parameters Explained

  • action_type: The operation to perform

    • analyse: Scan all Java files and build the dependency graph. Run this first and again whenever the source code changes. Results are cached for subsequent operations.
    • split_copy: Extract a macro service using symbolic links. Each Java file in the output links back to the source file, so edits made in the macro service are applied to the main codebase automatically. Recommended for development.
    • full_split_copy: Extract a macro service using full file copies. No symbolic links are created, so changes in the macro service do NOT reflect in the source. Use for production deployment.
  • project_config_path: Absolute path to the project configuration YAML file

  • extraction_path: Directory where the extracted macro service will be created

  • split_csv_path: Path to CSV file containing the list of programs to extract

  • to_split_web: Boolean flag to include web components in the extraction

    • true: Include the Angular web module (needed if the use case has frontend screens)
    • false: Exclude the web module (use for backend-only / batch use cases)
  • to_delete: Boolean flag to delete existing output folder before extraction

    • true: Delete and recreate output directory (recommended for clean runs)
    • false: Keep existing files (may cause conflicts)
  • to_load_dependencies: Boolean flag to load additional dependencies from the dependency CSV

    • true: Use the dependency CSV to resolve the full program call chain from entry points (recommended)
    • false: Only extract the exact programs listed in the CSV without following dependency chains
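
The parameters above can be sanity-checked before a run. The following is a sketch of such a pre-flight check on an already-loaded configuration dictionary; check_command_config is a hypothetical helper, the extractor's own validation may differ, and the sketch assumes every key shown in the example is required, which may be stricter than the tool.

```python
# Hypothetical pre-flight check; not the extractor's own validation.
VALID_ACTIONS = {"analyse", "split_copy", "full_split_copy"}
PATH_KEYS = ("project_config_path", "extraction_path", "split_csv_path")
BOOLEAN_FLAGS = ("to_split_web", "to_delete", "to_load_dependencies")


def check_command_config(config: dict) -> list[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    action = config.get("action_type")
    if action not in VALID_ACTIONS:
        problems.append(f"action_type must be one of {sorted(VALID_ACTIONS)}, got {action!r}")
    for key in PATH_KEYS:
        if not config.get(key):
            problems.append(f"{key} is missing or empty")
    for flag in BOOLEAN_FLAGS:
        if not isinstance(config.get(flag), bool):
            problems.append(f"{flag} must be true or false")
    return problems


# Values copied from the example command.yml above.
example = {
    "action_type": "split_copy",
    "project_config_path": "/path/to/project_config.yml",
    "extraction_path": "/path/to/output/",
    "split_csv_path": "/path/to/programs.csv",
    "to_split_web": False,
    "to_delete": True,
    "to_load_dependencies": True,
}
print(check_command_config(example))  # []
```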

3. Programs CSV File

A simple CSV file listing the programs you want to extract, one program name per line:

COSGN00C
CORPT00C
COTRN00C

Create this file in the macroservice-extractor/Workspace/CSV/ folder. A sample is provided at macroservice-extractor/Workspace/CSV/cc00.csv.
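
A programs file in the one-name-per-line layout shown above can also be generated from a script. This sketch uses hypothetical helper names (write_programs_csv, read_programs_csv) and a hypothetical file name, and it assumes that blank lines and duplicate names are unwanted; the tool itself may tolerate them.

```python
import tempfile
from pathlib import Path


def write_programs_csv(path: Path, programs: list[str]) -> list[str]:
    """Write one program name per line, dropping blanks and duplicates."""
    cleaned: list[str] = []
    for name in programs:
        name = name.strip()
        if name and name not in cleaned:
            cleaned.append(name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


def read_programs_csv(path: Path) -> list[str]:
    """Read the file back as a list of non-empty, stripped names."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Demo in a temporary directory so no real workspace is touched.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "Workspace" / "CSV" / "my_use_case.csv"
    write_programs_csv(csv_path, ["COSGN00C", "CORPT00C", " COTRN00C ", "COSGN00C", ""])
    round_trip = read_programs_csv(csv_path)

print(round_trip)  # ['COSGN00C', 'CORPT00C', 'COTRN00C']
```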

How the Configuration Files Work Together

  1. You create both YAML files with your project settings
  2. The command.yml references the project_config.yml via the project_config_path parameter
  3. When you run the splitter, you pass the command.yml file
  4. The tool loads the project configuration from project_config.yml
  5. The splitter executes the specified action using both configurations

Command Line Usage

python macroservice-extractor/Java_Code_Spliter/src/MainCommandLine.py -s macroservice-extractor/command.yml

Data Preparation

Converting AWS Transform JSON to CSV

The dependency data comes from the output artifacts of the AWS Transform analyze job (available in the AWS Transform web application). Convert the JSON output to the CSV format expected by this tool:

python macroservice-extractor/jsonConverter.py <input>.json <output>.csv
  • <input>.json: Path to the AWS Transform JSON output file containing dependency information
  • <output>.csv: Path where the converted CSV file will be saved

The converted CSV file should then be referenced by the dependencies_bluage_path parameter in project_config.yml.

Architecture

Core Components

Data Classes (macroservice-extractor/Java_Code_Spliter/src/dataClass/)

  • Program.py: Represents individual programs with identifiers
  • Split.py: Contains split configuration and dependencies
  • ProjectConfig.py: Project-wide configuration settings
  • SplitConfig.py: Split-specific configuration
  • DependenciesAnalysis.py: Dependency analysis results

Splitters (macroservice-extractor/Java_Code_Spliter/src/Spliter/)

  • Spliter.py: Abstract base class for all splitters
  • SymbolicCopySpliter.py: Creates symbolic links for split projects
  • FullCopySpliter.py: Creates complete copies of split projects
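
The practical difference between the two splitter strategies can be illustrated with the standard library. This is only a sketch of the general technique, not the code in SymbolicCopySpliter.py or FullCopySpliter.py; place_file is a hypothetical helper. Note that creating symbolic links on Windows may require elevated privileges or developer mode.

```python
import os
import shutil
import tempfile
from pathlib import Path


def place_file(source: Path, target: Path, use_symlink: bool) -> None:
    """Put `source` at `target`, either as a symbolic link or as a copy."""
    target.parent.mkdir(parents=True, exist_ok=True)
    if use_symlink:
        os.symlink(source.resolve(), target)
    else:
        shutil.copy2(source, target)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "src" / "Foo.java"
    source.parent.mkdir(parents=True)
    source.write_text("class Foo {}\n")

    linked = root / "linked" / "Foo.java"
    copied = root / "copied" / "Foo.java"
    place_file(source, linked, use_symlink=True)
    place_file(source, copied, use_symlink=False)

    # Writing through the link modifies the original file;
    # the full copy keeps its own, now outdated, content.
    linked.write_text("class Foo { int x; }\n")
    is_link = linked.is_symlink()
    source_after = source.read_text()
    copy_after = copied.read_text()

print(is_link, repr(source_after), repr(copy_after))
```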

Analysis Tools (macroservice-extractor/Java_Code_Spliter/src/)

  • DependenciesAnalyser.py: Main dependency analysis engine
  • ProgramId.py: Program identification and mapping
  • GrooviesAnalyser.py: Groovy file analysis
  • MainCommandLine.py: CLI entry point

Project Structure

macroservice-extractor/Java_Code_Spliter/src/
├── dataClass/          # Data models and configurations
├── Spliter/            # Core splitting logic
├── utils/              # Utility functions and helpers
├── DependenciesAnalyser.py  # Main analysis engine
├── GrooviesAnalyser.py      # Groovy analysis
├── MainCommandLine.py       # CLI entry point
└── ProgramId.py             # Program identification

How It Works

Dependency Analysis

  1. Import Analysis: Scans all Java files for import statements
  2. DAO Extraction: Identifies Data Access Object dependencies via FileIds references
  3. Cross-Reference Resolution: Resolves dependencies between modules
  4. Circular Dependency Detection: Identifies and handles circular references

Splitting Strategy

  1. Program Grouping: Groups related programs together based on the CSV input
  2. Dependency Resolution: For each program, resolves all transitive dependencies (imports, DAOs, runtime includes)
  3. Module Creation: Copies (or symlinks) only the required files into a new Maven project structure
  4. Configuration Generation: Generates Maven pom.xml and build configurations for the extracted module
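
The dependency resolution in step 2 amounts to computing a transitive closure over the dependency graph. A minimal sketch with a breadth-first traversal is shown below; resolve_closure and the program names are hypothetical, and the real tool also considers DAOs and runtime includes, which this sketch omits.

```python
from collections import deque


def resolve_closure(entry_points: list[str], calls: dict[str, set[str]]) -> set[str]:
    """Return the entry points plus everything reachable from them."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for callee in calls.get(current, ()):
            if callee not in seen:  # the `seen` set also guards against cycles
                seen.add(callee)
                queue.append(callee)
    return seen


# Invented call graph, for illustration only.
calls = {
    "PROG_A": {"PROG_B", "PROG_C"},
    "PROG_B": {"PROG_D"},
    "PROG_C": {"PROG_A"},  # circular reference back to the entry point
    "PROG_X": {"PROG_Y"},  # unrelated to the use case
}
needed = resolve_closure(["PROG_A"], calls)
print(sorted(needed))  # ['PROG_A', 'PROG_B', 'PROG_C', 'PROG_D']
```

Programs that are not reachable from the entry points (PROG_X and PROG_Y here) stay out of the result, which is what keeps the extracted project small.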

Error Handling

The tool detects and reports the following error conditions:

  • Missing configuration files
  • Invalid CSV formats
  • Circular dependencies
  • File access issues
  • Module path problems

Logging

Logging is configurable and provides:

  • Debug mode (--debug) for detailed analysis output
  • Warning messages for potential issues (missing programs, unresolved dependencies)
  • Error reporting with context
  • Progress tracking for long operations
  • Log output written to macroservice-extractor/app.log

Cleanup

This project does not deploy any cloud resources. To clean up locally, delete the extraction output folder and any generated .pkl cache files in macroservice-extractor/Workspace/.

Security

See SECURITY.md for security policy and production hardening recommendations. Additionally, see CONTRIBUTING for more information.

License

This project is licensed under the MIT-0 License. See the LICENSE file.
