atenearesearchgroup/instance-generation-MODELS25

Summary

  • GPT-4o (gpt-4o-2024-08-06).
  • 30 generations for each domain.
  • Context window size: 64000 tokens.
  • Max completion tokens: 16384 tokens.
  • Temperature: 1.0.
  • Top P: 1.0.
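As a minimal sketch, the settings above could be captured in one configuration object. The class and field names below are hypothetical and are not taken from the repository. The `prompt_budget` helper assumes that completion tokens count against the same context window, which the summary does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the experiment parameters listed above."""

    model: str = "gpt-4o-2024-08-06"
    generations_per_domain: int = 30
    context_window_tokens: int = 64_000
    max_completion_tokens: int = 16_384
    temperature: float = 1.0
    top_p: float = 1.0

    def prompt_budget(self) -> int:
        # Assumption: the completion shares the context window with the prompt,
        # so the prompt may use whatever the completion limit leaves over.
        return self.context_window_tokens - self.max_completion_tokens


settings = GenerationSettings()
print(settings.prompt_budget())  # 64000 - 16384 = 47616
```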

File structure

llm-instancer

Main repository for instance generation and semantic evaluation.

llm-evaluator

Main repository for diversity evaluation.

Inputs

Prompts

  • prompts.md All prompts used.
  • example.soil Syntax example (shot) given to the LLM for instance generation.

Prompts / system

  • diagram.use Class diagram in USE of the domain model.
  • diagram.pdf Class diagram in PDF of the domain model.
  • diagram_default.clt Autogenerated config file by USE.

Outputs

Instances / Simple / system / experimentDate

  • logs.md File with input/output messages from the LLM and additional parameters (input/output tokens, total tokens, temperature, context window size, etc.).
  • metrics.md File with general metrics (syntax, multiplicities, invariant errors) and specific/semantic ones per system (e.g., Bank: valid IBANs, valid BICs, etc.). (Calculated for each generation, plus a summary for all generations.)

Instances / Simple / system / experimentDate / gen_i

  • output.soil Instance output for that generation.

Instances / CoT / system / experimentDate

  • logs.md File with input/output messages from the LLMs and additional parameters (input/output tokens, total tokens, temperature, context window size, etc.). NOTE: Because execution runs in parallel, input/output messages are ordered by execution, not by category.
  • metrics.md File with general metrics (syntax, multiplicities, invariant errors) and specific/semantic ones per system (e.g., Bank: valid IBANs, valid BICs, etc.). (Calculated for each category and for each generation, plus a summary for all generations.)

Instances / CoT / system / experimentDate / gen_i

  • category.soil Instance output for that category.
  • outputInvalid.soil Contains only the invalid category for that generation.
  • outputValid.soil Contains all categories except the invalid one for that generation.
  • output.soil Combined instance outputs for all categories for that generation.
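The output layout described above can be walked with a few lines of `pathlib`. This is only a sketch: the function name is hypothetical, and it assumes the folders are literally named `Instances/<approach>/<system>/<experimentDate>/gen_<i>` on disk, which should be checked against the actual repository.

```python
from pathlib import Path


def collect_outputs(root, approach, system):
    """Group every gen_*/output.soil file by its experiment-date folder.

    Assumed layout: <root>/Instances/<approach>/<system>/<experimentDate>/gen_<i>/output.soil
    """
    base = Path(root) / "Instances" / approach / system
    found = {}
    for soil in sorted(base.glob("*/gen_*/output.soil")):
        experiment = soil.parent.parent.name  # the experimentDate folder
        found.setdefault(experiment, []).append(soil)
    return found
```

Calling `collect_outputs(".", "CoT", "bank")` would return a dictionary from experiment folder name to the list of combined `output.soil` files, one per generation (the system name `bank` is a placeholder).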

Syntax & Conformance evaluation

Semantic evaluation

Address Book

  • Phone : RegEx : ^(\\+\\d{1,3}\\s?)?[0-9\\(\\)-.\\s]{6,15}$
  • Website : RegEx : ^(https?://)?([\\w-]+\\.)?[\\w-]+(\\.[a-z]{2,}(\\.[a-z]{2,})?)?(:\\d+)?(/[\\w-./?%&=]*)?$
  • Email : RegEx : ^[\\w!#$%&'*+/=?`{|}~\\^-]+(?:\\.[\\w!#$%&'*+/=?`{|}~\\^-]+)*@(?:[\\w-]+\\.)*[\\w-]+\\.[a-zA-Z]{2,}$
  • Address : API : https://www.geoapify.com/
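A minimal sketch of how the Address Book patterns above could be applied in Python. It assumes the doubled backslashes in the listing are string-literal escapes, so raw strings with single backslashes are used here. The function and dictionary names are hypothetical. One adaptation: in the Website pattern the hyphen inside the path character class is escaped (`[\w\-./?%&=]`), because Python's `re` can reject `\w-.` as a bad character range; this is a change for illustration, not the repository's pattern.

```python
import re

# Patterns transcribed from the list above (Website adapted, see lead-in).
PATTERNS = {
    "phone": re.compile(r"^(\+\d{1,3}\s?)?[0-9\(\)-.\s]{6,15}$"),
    "website": re.compile(
        r"^(https?://)?([\w-]+\.)?[\w-]+(\.[a-z]{2,}(\.[a-z]{2,})?)?"
        r"(:\d+)?(/[\w\-./?%&=]*)?$"
    ),
    "email": re.compile(
        r"^[\w!#$%&'*+/=?`{|}~\^-]+(?:\.[\w!#$%&'*+/=?`{|}~\^-]+)*"
        r"@(?:[\w-]+\.)*[\w-]+\.[a-zA-Z]{2,}$"
    ),
}


def is_valid(kind, value):
    """Return True if `value` matches the pattern registered for `kind`."""
    return PATTERNS[kind].match(value) is not None


print(is_valid("phone", "+34 612 345 678"))
print(is_valid("email", "jane.doe@example.com"))
print(is_valid("website", "https://www.example.com/path?x=1"))
```

These checks are purely syntactic; address validation relies on the Geoapify API and is not covered by this sketch.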

Bank

Hotel Management

Expenses

Pickup Net

  • Address, Latitude, Longitude : API : https://www.geoapify.com/
  • X (Twitter) Username : RegEx : ^@?[a-zA-Z_][a-zA-Z0-9_]{3,14}$
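The username pattern above accepts an optional leading `@`, then a letter or underscore, then 3 to 14 further letters, digits, or underscores. A short Python sketch follows, again assuming the pattern is meant as written; the function name is hypothetical.

```python
import re

# Pattern transcribed from the Pickup Net entry above.
X_USERNAME = re.compile(r"^@?[a-zA-Z_][a-zA-Z0-9_]{3,14}$")


def is_valid_username(value):
    """Return True if `value` matches the X (Twitter) username pattern."""
    return X_USERNAME.match(value) is not None


for candidate in ["@jack", "@ab", "@1abcd", "pickup_net_2025"]:
    print(candidate, is_valid_username(candidate))
```

Under this pattern the total length, excluding the `@`, is between 4 and 15 characters, and the first character cannot be a digit.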

Vehicle rental

  • Address : API : https://www.geoapify.com/
  • License plate : RegEx : ^[A-Z0-9][A-Z0-9\\s-]{1,9}[A-Z0-9]$
  • Home Phone : RegEx : ^(\\+\\d{1,3}\\s?)?[0-9\\(\\)-.\\s]{6,15}$
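As an illustration of the Vehicle rental checks, the two regular expressions above can be exercised as follows. The doubled backslashes are assumed to be string-literal escapes, and the helper names are hypothetical.

```python
import re

# Patterns transcribed from the Vehicle rental entries above.
LICENSE_PLATE = re.compile(r"^[A-Z0-9][A-Z0-9\s-]{1,9}[A-Z0-9]$")
HOME_PHONE = re.compile(r"^(\+\d{1,3}\s?)?[0-9\(\)-.\s]{6,15}$")


def is_valid_plate(value):
    """Uppercase letters/digits at both ends, 3 to 11 characters in total."""
    return LICENSE_PLATE.match(value) is not None


def is_valid_home_phone(value):
    """Optional +country code, then 6 to 15 digits or separator characters."""
    return HOME_PHONE.match(value) is not None


print(is_valid_plate("1234 ABC"))
print(is_valid_plate("abc-123"))
print(is_valid_home_phone("(555) 123-4567"))
```

The license plate pattern is case-sensitive, so lowercase plates are rejected unless the value is uppercased before the check.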

Videoclub

Football

Restaurant

Metrics

Diversity

  • simpleDifference.md : Semantic difference within and across generated instances for the Simple approach.
  • cotDifference.md : Semantic difference within and across generated instances for the CoT approach.
  • combinedDifference.md : Combined semantic difference within and across generated instances.
  • Semantic Diversity Results.xlsx : Summary file and calculations.

Preliminary Experiments

  • Preliminary Experiments.xlsx : Summary file of preliminary experiments for selecting LLMs.

Total executed checks

  • Total executed checks per domain.xlsx : Summary file of the total executed checks fed back to the LLMs.
