
mongodb-labs/py-tpcc


TPC-C in Python for MongoDB

Approved in July 1992, TPC Benchmark C (TPC-C) is an on-line transaction processing (OLTP) benchmark. It is more complex than earlier OLTP benchmarks such as TPC-A because it has multiple transaction types, a more complex database, and a more complex overall execution structure. TPC-C involves a mix of five concurrent transactions of different types and complexity, either executed on-line or queued for deferred execution. The database comprises nine types of tables with a wide range of record and population sizes. TPC-C performance is measured in transactions per minute (tpmC). Although the benchmark portrays the activity of a wholesale supplier, TPC-C is not limited to any particular business segment; rather, it represents any industry that must manage, sell, or distribute a product or service.
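As a rough illustration of the metric: in the TPC-C specification, tpmC counts completed New-Order transactions per minute while the full five-transaction mix is running. The sketch below shows only that arithmetic. The function name `tpmc` and the sample numbers are made up for illustration and are not taken from this repo's code or results.

```python
def tpmc(new_order_count: int, duration_seconds: float) -> float:
    """Return New-Order transactions per minute for one measurement interval."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return new_order_count * 60.0 / duration_seconds


# Hypothetical run: 12,000 New-Order transactions completed in 600 seconds.
print(tpmc(12_000, 600))  # 1200.0
```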

To learn more about TPC-C, please see the TPC-C documentation.

This repo is an experimental variant of the Python TPC-C implementation, based on the original here.

The structure of the repo is:

  1. pytpcc - the pytpcc code, with driver-specific (per-database) code in the drivers subdirectory.
  2. vldb2019 - the 2019 VLDB paper, poster, and results generated from this code.

All the tests were run using MongoDB Atlas. Use code VLDB2019 to get a $150 credit to get started with MongoDB Atlas.

Sharded MongoDB Driver

  1. Create and activate a Python virtual environment.
mkdir ~/python_envs
cd ~/python_envs
python -m venv py-tpcc-env
source ~/python_envs/py-tpcc-env/bin/activate
  1. Install pymongo
pip install pymongo 
  1. Print your config.
cd ~/py-tpcc/pytpcc
python ./tpcc.py --print-config mongodb > mongodb.config
  1. Edit the configuration for MongoDB in mongodb.config.
    • Change shards to the number of shards
    • Change the mongodb connection uri string
    • Change the database name
# MongodbDriver Configuration File
# Created 2025-10-08 14:18:24.378446
[mongodb]

# The mongodb connection string or URI
uri                  = mongodb://user:<password>@<host>:27017/admin?ssl=true&tlsAllowInvalidHostnames=true&tlsAllowInvalidCertificates=true

# Database name
name                 = tpcc

# If true, data will be denormalized using MongoDB schema design best practices
denormalize          = True

# If true, transactions will not be used (benchmarking only)
notransactions       =

# If true, all things to update will be fetched via findAndModify
findandmodify        = True

# If true, aggregation queries will be used
agg                  =

# If true, we will allow secondary reads
secondary_reads      = True

# If true, we will enable retryable writes
retry_writes         = True

# If true, we will perform causal reads
causal_consistency   = True

# If true, we will have use only one 'unsharded' items collection
no_global_items      =

# If > 0 then sharded
shards               = 3
  1. Run pytpcc using --warehouses=XXX

    • Reset the database and load the data
    python ./tpcc.py --reset --no-execute --clients=100 --duration=10 --warehouses=21 --config=mongodb.config mongodb --stop-on-error
    • Only load the data
    python ./tpcc.py --no-execute --clients=100 --duration=10 --warehouses=21 --config=mongodb.config mongodb --stop-on-error
    • Execute the tests without loading data.
    python ./tpcc.py --no-load --clients=100 --duration=10 --warehouses=21 --config=mongodb.config mongodb --stop-on-error
    • Execute the tests with loading
    python ./tpcc.py --clients=100 --duration=10 --warehouses=21 --config=mongodb.config mongodb --stop-on-error
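
The four invocations above differ only in the `--reset`, `--no-execute`, and `--no-load` flags. A minimal sketch of a helper that assembles them is shown below. The function name `build_tpcc_command` and the phase labels are hypothetical conveniences, not part of pytpcc; only the flags themselves come from the commands above.

```python
import shlex

# Flags that select each phase combination, as used in the commands above.
# The dictionary keys are hypothetical labels chosen for this sketch.
PHASE_FLAGS = {
    "reset-and-load": ["--reset", "--no-execute"],
    "load-only": ["--no-execute"],
    "execute-only": ["--no-load"],
    "load-and-execute": [],
}


def build_tpcc_command(phase, warehouses, clients=100, duration=10,
                       config="mongodb.config", driver="mongodb"):
    """Return the argv list for one tpcc.py invocation."""
    if phase not in PHASE_FLAGS:
        raise ValueError(f"unknown phase: {phase!r}")
    return [
        "python", "./tpcc.py",
        *PHASE_FLAGS[phase],
        f"--clients={clients}",
        f"--duration={duration}",
        f"--warehouses={warehouses}",
        f"--config={config}",
        driver,
        "--stop-on-error",
    ]


# Print each command line so it can be reviewed before running it.
for phase in PHASE_FLAGS:
    print(shlex.join(build_tpcc_command(phase, warehouses=21)))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the pytpcc directory. Building an argv list rather than a shell string avoids quoting problems if the config path contains spaces.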

Postgres JSONB Driver

This branch contains a Postgres JSONB Driver.

Steps to run the PostgreSQL JSONB Driver

  1. Start Postgres.
sudo systemctl start postgresql
  1. Create and activate a Python virtual environment.
mkdir ~/python_envs
cd ~/python_envs
python -m venv py-tpcc-env
source ~/python_envs/py-tpcc-env/bin/activate
  1. Print your config.
cd ~/py-tpcc/pytpcc
python ./tpcc.py --print-config postgresqljsonb > postgresqljsonb.config
  1. Edit the configuration for Postgres in postgresqljsonb.config. Add a password.
# PostgresqljsonbDriver Configuration File
# Created 2025-03-18 23:00:45.340852
[postgresqljsonb]

# The name of the PostgreSQL database
database             = tpcc

# The host address of the PostgreSQL server
host                 = localhost

# The port number of the PostgreSQL server
port                 = 5432

# The username to connect to the PostgreSQL database
user                 = postgres

# The password to connect to the PostgreSQL database
password             = <ADD_PASSWORD_HERE>
  1. Run the PostgreSQL JSONB driver tests with resetting the database.
~/py-tpcc/pytpcc$ python ./tpcc.py --reset --clients=1 --duration=1 --warehouses=1 --ddl tpcc_jsonb.sql --config=postgresqljsonb.config postgresqljsonb --stop-on-error
  1. Run the PostgreSQL JSONB driver tests with no load phase to use the data that is already loaded in the Postgres database.
~/py-tpcc/pytpcc$ python ./tpcc.py --no-load --clients=1 --duration=1 --warehouses=1 --ddl tpcc_jsonb.sql --config=postgresqljsonb.config postgresqljsonb --stop-on-error
  1. If you need to connect to Postgres and check the database size, use psql:
psql -U postgres # and type the password
postgres=# \l+

# For any SQL command first use the database
\c tpcc;
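
A common slip in the configuration step above is leaving `<ADD_PASSWORD_HERE>` in postgresqljsonb.config. The sketch below reads the file's INI-style layout with Python's standard `configparser` and refuses to continue if the placeholder is still present. The function name `load_pg_settings` is hypothetical, and this is not necessarily how tpcc.py itself parses the file.

```python
import configparser

PLACEHOLDER = "<ADD_PASSWORD_HERE>"


def load_pg_settings(config_text):
    """Parse the [postgresqljsonb] section and return it as a dict."""
    # interpolation=None keeps any '%' characters in the password literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = parser["postgresqljsonb"]
    settings = {
        "database": section["database"],
        "host": section["host"],
        "port": section.getint("port"),
        "user": section["user"],
        "password": section["password"],
    }
    if settings["password"] in ("", PLACEHOLDER):
        raise ValueError("set a real password in postgresqljsonb.config")
    return settings


# Example with an illustrative password; real values come from your own file.
sample = """
[postgresqljsonb]
database = tpcc
host = localhost
port = 5432
user = postgres
password = example-password
"""
print(load_pg_settings(sample)["port"])  # 5432
```

To check the real file, read it first, for example `load_pg_settings(open("postgresqljsonb.config").read())`.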

About

MongoDB Adaptation of PyTPCC
