- GPT-4o (gpt-4o-2024-08-06).
- 30 generations for each domain.
- Context window size: 64000 tokens.
- Max completion tokens: 16384 tokens.
- Temperature: 1.0.
- Top P: 1.0.
Main repository for instance generation and semantic evaluation.
Main repository for diversity evaluation.
- prompts.md : All prompts used.
- example.soil : Syntax example (shot) of instance generation given to the LLM.
- diagram.use : Class diagram of the domain model in USE.
- diagram.pdf : Class diagram of the domain model in PDF.
- diagram_default.clt : Config file autogenerated by USE.
- logs.md : File with input/output messages from the LLM and additional parameters (input/output tokens, total tokens, temperature, context window size, etc.).
- metrics.md : File with general metrics (syntax, multiplicities, invariant errors) and specific/semantic ones per system (e.g., Bank: valid IBANs, valid BICs, etc.). Calculated for each generation, with a summary across all generations.
- output.soil : Instance output for that generation.
- logs.md : File with input/output messages from the LLMs and additional parameters (input/output tokens, total tokens, temperature, context window size, etc.). NOTE: Because categories are executed in parallel, input/output messages are ordered by execution, not by category.
- metrics.md : File with general metrics (syntax, multiplicities, invariant errors) and specific/semantic ones per system (e.g., Bank: valid IBANs, valid BICs, etc.). Calculated for each category and each generation, with a summary across all generations.
- category.soil : Instance output for that category.
- outputInvalid.soil : Contains only the invalid category for that generation.
- outputValid.soil : Contains all categories except the invalid one for that generation.
- output.soil : Combined instance outputs of all categories for that generation.
- UML-based Specification Environment Tool (USE) https://github.com/useocl/use
- Phone : RegEx :
^(\\+\\d{1,3}\\s?)?[0-9\\(\\)-.\\s]{6,15}$
- Website : RegEx :
^(https?://)?([\\w-]+\\.)?[\\w-]+(\\.[a-z]{2,}(\\.[a-z]{2,})?)?(:\\d+)?(/[\\w-./?%&=]*)?$
- Email : RegEx :
^[\\w!#$%&'*+/=?`{|}~\\^-]+(?:\\.[\\w!#$%&'*+/=?`{|}~\\^-]+)*@(?:[\\w-]+\\.)*[\\w-]+\\.[a-zA-Z]{2,}$
- Address : API : https://www.geoapify.com/
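The Phone, Website, and Email patterns above are written with doubled backslashes, which suggests they are Java string literals. The following is a minimal sketch of how such checks could be applied with `java.util.regex.Pattern`; the class name `ContactChecks` and its method names are hypothetical and not taken from the repository.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only the three patterns come from the list above.
final class ContactChecks {
    static final Pattern PHONE = Pattern.compile(
            "^(\\+\\d{1,3}\\s?)?[0-9\\(\\)-.\\s]{6,15}$");
    static final Pattern WEBSITE = Pattern.compile(
            "^(https?://)?([\\w-]+\\.)?[\\w-]+(\\.[a-z]{2,}(\\.[a-z]{2,})?)?(:\\d+)?(/[\\w-./?%&=]*)?$");
    static final Pattern EMAIL = Pattern.compile(
            "^[\\w!#$%&'*+/=?`{|}~\\^-]+(?:\\.[\\w!#$%&'*+/=?`{|}~\\^-]+)*@(?:[\\w-]+\\.)*[\\w-]+\\.[a-zA-Z]{2,}$");

    // matches() requires the whole string to match the pattern.
    static boolean isPhone(String value)   { return PHONE.matcher(value).matches(); }
    static boolean isWebsite(String value) { return WEBSITE.matcher(value).matches(); }
    static boolean isEmail(String value)   { return EMAIL.matcher(value).matches(); }

    public static void main(String[] args) {
        System.out.println("phone:   " + isPhone("+34 612 345 678"));
        System.out.println("website: " + isWebsite("https://example.com/path"));
        System.out.println("email:   " + isEmail("alice@example.com"));
    }
}
```

These patterns check only the shape of a value: a string that passes is syntactically plausible, not necessarily a number, site, or mailbox that exists.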
- IBAN, BIC : Library : https://gitlab.com/schegge-projects/bank-account-validator
- Realistic : RegEx :
([A-Z]{2})(\\d{2})([A-Z0-9]{11,30})
- Real : Checksum + RegEx
- Realistic : RegEx :
- Country : java.util.Locale : https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html
- Dates : java.time.LocalDate & Comparator : https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html
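To illustrate the difference between the "Realistic" and "Real" IBAN checks listed above, here is a standalone sketch that pairs the regular expression with the standard mod-97 checksum. It does not use the bank-account-validator library's API. The class `IbanChecks`, its method names, and the stripping of spaces before matching are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the repository or the validator library.
final class IbanChecks {
    // "Realistic" pattern from the list: country code, check digits, account part.
    static final Pattern REALISTIC =
            Pattern.compile("([A-Z]{2})(\\d{2})([A-Z0-9]{11,30})");

    // Assumption: IBANs may be printed in groups of four, so spaces are removed first.
    static String compact(String iban) {
        return iban.replace(" ", "");
    }

    // Shape check only: the value looks like an IBAN.
    static boolean isRealistic(String iban) {
        return REALISTIC.matcher(compact(iban)).matches();
    }

    // Shape check plus mod-97 checksum: move the first four characters to the
    // end, map A..Z to 10..35, and require the resulting number mod 97 to be 1.
    static boolean isReal(String iban) {
        String value = compact(iban);
        if (!REALISTIC.matcher(value).matches()) {
            return false;
        }
        String rearranged = value.substring(4) + value.substring(0, 4);
        int remainder = 0;
        for (int i = 0; i < rearranged.length(); i++) {
            char c = rearranged.charAt(i);
            if (c >= '0' && c <= '9') {
                remainder = (remainder * 10 + (c - '0')) % 97;
            } else {
                // Letters contribute two decimal digits each.
                remainder = (remainder * 100 + (c - 'A' + 10)) % 97;
            }
        }
        return remainder == 1;
    }

    public static void main(String[] args) {
        System.out.println(isRealistic("GB82 WEST 1234 5698 7654 32"));
        System.out.println(isReal("GB82 WEST 1234 5698 7654 32"));
    }
}
```

A value such as `GB83 WEST 1234 5698 7654 32` passes the realistic check but fails the real one, because only the check digits differ from a valid IBAN.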
- Dates : java.time.LocalDate & Comparator : https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html
- Address, Latitude, Longitude : API : https://www.geoapify.com/
- X (Twitter) Username : RegEx :
^@?[a-zA-Z_][a-zA-Z0-9_]{3,14}$
- Address : API : https://www.geoapify.com/
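The X (Twitter) username pattern above encodes three constraints that are easy to misread: an optional leading `@`, a first character that is a letter or underscore, and 3 to 14 further characters, giving 4 to 15 characters without the `@`. The sketch below exercises those boundaries; the class name `UsernameCheck` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only the pattern comes from the list above.
final class UsernameCheck {
    static final Pattern USERNAME =
            Pattern.compile("^@?[a-zA-Z_][a-zA-Z0-9_]{3,14}$");

    static boolean isUsername(String value) {
        return USERNAME.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Four characters with and without "@", then a three-character name.
        System.out.println(isUsername("@jack"));
        System.out.println(isUsername("jack"));
        System.out.println(isUsername("abc"));
    }
}
```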
- License plate : RegEx :
^[A-Z0-9][A-Z0-9\\s-]{1,9}[A-Z0-9]$
- Home Phone : RegEx :
^(\\+\\d{1,3}\\s?)?[0-9\\(\\)-.\\s]{6,15}$
- Production title, genre, type, actors, release date : API : https://www.omdbapi.com/
- Dates : java.time.LocalDate & Comparator : https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html
- Player names, Club names, Team names, Competition names : Manual
- Dates : java.time.LocalDate & Comparator : https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html
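The "Dates" checks above name `java.time.LocalDate` and a `Comparator` without further detail. One plausible reading is that each date must parse as a real calendar date and that related dates must appear in chronological order. The sketch below follows that reading; the class `DateChecks`, its method names, and the ISO `yyyy-MM-dd` input format are assumptions, not details confirmed by the repository.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class illustrating the two kinds of date checks.
final class DateChecks {
    // Assumption: dates are ISO-8601 strings such as "2024-02-29".
    // LocalDate.parse rejects impossible dates such as "2023-02-29".
    static Optional<LocalDate> parse(String text) {
        try {
            return Optional.of(LocalDate.parse(text));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // True when every date is on or after the one before it.
    static boolean isChronological(List<LocalDate> dates) {
        Comparator<LocalDate> order = Comparator.naturalOrder();
        for (int i = 1; i < dates.size(); i++) {
            if (order.compare(dates.get(i - 1), dates.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(parse("2024-02-29").isPresent());
        System.out.println(isChronological(Arrays.asList(
                LocalDate.of(2024, 1, 1), LocalDate.of(2024, 6, 30))));
    }
}
```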
- Phone : RegEx :
^(\\+\\d{1,3}\\s?)?[0-9\\(\\)-.\\s]{6,15}$
- Person names, Restaurant names, Driver licenses, Menu items, Food items : Manual
- simpleDifference.md : Semantic difference within and across generated instances for the Simple approach.
- cotDifference.md : Semantic difference within and across generated instances for the CoT approach.
- combinedDifference.md : Combined semantic difference within and across generated instances.
- Semantic Diversity Results.xlsx : Summary file and calculations.
- Preliminary Experiments.xlsx : Summary file of preliminary experiments for selecting LLMs.
- Total executed checks per domain.xlsx : Summary file of the total executed checks fed back to the LLMs.