Commit 2acfca0

Refactor codebase to modular architecture with multi-format output
- Migrate from structopt to clap v4 for argument parsing
- Add JSON and CSV output formats alongside text
- Introduce trait-based GPU abstraction (GpuDevice, GpuManager)
- Create comprehensive GpuMetrics struct with all NVML metrics
- Add structured error handling with GpuError enum
- Update dependencies (nvml-wrapper 0.10, sysinfo 0.32, serde)
- Expose library crate for programmatic use
- Bump version to 0.2.0 and edition to 2021
1 parent 9478263 commit 2acfca0

15 files changed

Lines changed: 1255 additions & 304 deletions

Cargo.toml

Lines changed: 13 additions & 9 deletions
@@ -1,29 +1,33 @@
 [package]
 name = "gpuinfo"
-version = "0.1.3"
+version = "0.2.0"
 authors = ["Edward Hu <bodunhu@utexas.edu>"]
-edition = "2018"
+edition = "2021"
 license = "MIT"
 description = "A minimal command-line utility for querying GPU status"
 homepage = "https://github.com/BDHU/gpuinfo"
 repository = "https://github.com/BDHU/gpuinfo"
 readme = "README.md"
 keywords = ["rust", "gpu", "nvidia-smi", "command-line", "monitoring"]
-categories = ["command-line-utilities", "gpu"]
+categories = ["command-line-utilities"]
 exclude = ["*.png", "*.jpg"]

 [[bin]]
 name = "gpu-info"
 path = "src/main.rs"

-
-# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+[lib]
+name = "gpuinfo"
+path = "src/lib.rs"

 [dependencies]
-structopt = "^0.3"
-nvml-wrapper = "^0.7.0"
-nvml-wrapper-sys = "^0.5.0"
-sysinfo = "^0.19.0"
+clap = { version = "4", features = ["derive"] }
+nvml-wrapper = "0.10"
+sysinfo = "0.32"
+serde = { version = "1.0", features = ["derive"] }
+serde_json = "1.0"
+atty = "0.2"

 [profile.release]
 opt-level = "s"
+lto = true

README.md

Lines changed: 41 additions & 13 deletions
@@ -9,22 +9,38 @@ A small command-line tool used to query and monitor GPU status.

 ![gpuinfo-screenshot](gpuinfo.png)

-NOTE: We only support NVIDIA GPU currently, AMD GPU is not yet supported. All contributions are welcome! This is an ongoing project and there might be changes in the future. The tool is tested on Linux. It might also work on macOS and Windows with some features missing.
+NOTE: We only support NVIDIA GPUs currently. All contributions are welcome! This is an ongoing project and there might be changes in the future. The tool is tested on Linux. It might also work on macOS and Windows with some features missing.

 Usage
 -----

-
 ```bash
 $ gpu-info
 ```

-Options:
+Common options:

 * `-w`, `--watch`: Prints GPU information to terminal every second
-* `-i`, `--interval <interval>`: Prints GPU information to terminal according to given interval (integer seconds)
+* `-i`, `--interval <seconds>`: Prints GPU information to terminal at the given interval
+* `-f`, `--format <text|json|csv>`: Output format (default: text)
+* `--no-color`: Disable ANSI colors in text output
+* `--no-header`: Disable CSV header row
+* `-g`, `--gpu <index>`: Select a specific GPU by index
+* `-v`, `--verbose`: Verbose output (text)
+* `-q`, `--quiet`: Minimal output (text)
+
+Examples:
+
+```bash
+# Watch with default text output
+gpu-info --watch
+
+# Output JSON for tooling
+gpu-info --format json

-NOTE: more options are to be added.
+# Output CSV without a header (useful for appending)
+gpu-info --format csv --no-header
+```

 Installation
 ------------
@@ -38,15 +54,27 @@ cargo install gpuinfo
 Output
 ------

-> [0]: Tesla P100-SXM2-16GB | 60 | 0 % | 1544 / 16280 MB | 37°C | No running processes found
+Text output (default):
+
+> [0] RTX 4090 | SM8.9 | 12% | 2048/24576MB | 45C | Fan:30% | 120/450W | 2100MHz/10500MHz | Gen4x16 | python:512MB
+
+* `[0]`: GPU index
+* `RTX 4090`: GPU name
+* `SM8.9`: Compute capability
+* `12%`: GPU utilization
+* `2048/24576MB`: Memory used/total
+* `45C`: Temperature
+* `Fan:30%`: Fan speed (if available)
+* `120/450W`: Power usage/limit (if available)
+* `2100MHz/10500MHz`: Graphics/memory clocks (if available)
+* `Gen4x16`: PCIe link info (if available)
+* `python:512MB`: Running processes and their GPU memory use

-* `[0]`: PCI_BUS_ID of the GPU. Beware that CUDA might assign different device ID. Ensure `CUDA_DEVICE_ORDER` is assigned `PCI_BUS_ID` will guarantee both `gpu-info` and CUDA yield the same result
-* `Tesla P100-SXM2-16GB`: Name of the GPU
-* `60`: Major and minor number of the GPU
-* `0 %`: Current GPU utilization rate
-* `1544 / 16280 MB`: GPU device memory usage
-* `37°C`: GPU temperature
-* `No running processes found`: Currently running processes on the GPU (note: only processes the user have privilege to access are shown).
+CSV output (with header by default):
+
+```text
+index,name,uuid,gpu_util%,mem_util%,mem_used_mb,mem_total_mb,temp_c,power_w,power_limit_w,clock_graphics_mhz,clock_memory_mhz,fan%,pcie_gen,pcie_width,process_count
+```

 License
 -------
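
Reviewer note on the CSV format documented in the README diff: the sketch below shows one way a row could be assembled, including the effect of `--no-header`. It is not the commit's implementation. `Row`, `HEADER`, `escape`, and `to_csv` are hypothetical names, the struct covers only a subset of the sixteen README columns, and rendering an unavailable metric as an empty field is an assumption.

```rust
/// Hypothetical, trimmed-down stand-in for the crate's metrics type.
/// Only a subset of the README's CSV columns is modelled here.
pub struct Row {
    pub index: u32,
    pub name: String,
    pub gpu_util: Option<u32>, // percent
    pub mem_used_mb: u64,
    pub mem_total_mb: u64,
    pub temp_c: Option<u32>,
    pub fan: Option<u32>, // percent
}

/// Header for the subset of columns used in this sketch.
pub const HEADER: &str = "index,name,gpu_util%,mem_used_mb,mem_total_mb,temp_c,fan%";

/// Quote a field if it contains a comma, a double quote, or a line break,
/// doubling any embedded quotes (the usual CSV convention).
pub fn escape(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Assumption: an unavailable metric becomes an empty CSV field.
fn opt(value: Option<u32>) -> String {
    value.map(|v| v.to_string()).unwrap_or_default()
}

/// Render rows as CSV; `with_header = false` mirrors `--no-header`.
pub fn to_csv(rows: &[Row], with_header: bool) -> String {
    let mut out = String::new();
    if with_header {
        out.push_str(HEADER);
        out.push('\n');
    }
    for r in rows {
        let fields = [
            r.index.to_string(),
            escape(&r.name),
            opt(r.gpu_util),
            r.mem_used_mb.to_string(),
            r.mem_total_mb.to_string(),
            opt(r.temp_c),
            opt(r.fan),
        ];
        out.push_str(&fields.join(","));
        out.push('\n');
    }
    out
}
```

Keeping the header optional is what makes appending repeated samples to one file practical: the first run writes the header, later runs pass `--no-header`.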

src/argparse.rs

Lines changed: 70 additions & 11 deletions
@@ -1,18 +1,77 @@
-use structopt::StructOpt;
+use clap::{Parser, ValueEnum};

-#[derive(Debug, StructOpt)]
-#[structopt(name = "gpu-info", about = "A list of command line flags.")]
+/// Output format options
+#[derive(Debug, Clone, Copy, ValueEnum, Default)]
+pub enum OutputFormat {
+    /// Human-readable text with optional colors
+    #[default]
+    Text,
+    /// JSON format for programmatic consumption
+    Json,
+    /// CSV format for spreadsheets and data analysis
+    Csv,
+}
+
+/// A minimal command-line utility for querying GPU status
+#[derive(Debug, Parser)]
+#[command(
+    name = "gpu-info",
+    about = "A command-line GPU monitoring utility",
+    version,
+    author
+)]
 pub struct Opt {
-    /// Prints GPU information to terminal every second
-    #[structopt(short = "w", long = "watch", about = "watch over updated metrics.")]
+    /// Watch mode: refresh every second
+    #[arg(short = 'w', long = "watch", conflicts_with = "interval")]
     pub watch: bool,

-    /// Prints GPU information to terminal according to given interval (seconds)
-    #[structopt(short = "i", long = "interval", about = "updated metrics based on given interval.")]
+    /// Refresh interval in seconds
+    #[arg(
+        short = 'i',
+        long = "interval",
+        value_name = "SECONDS",
+        conflicts_with = "watch"
+    )]
     pub interval: Option<u64>,
+
+    /// Output format (text, json, csv)
+    #[arg(short = 'f', long = "format", value_enum, default_value = "text")]
+    pub format: OutputFormat,
+
+    /// Disable colored output
+    #[arg(long = "no-color")]
+    pub no_color: bool,
+
+    /// Disable CSV header row
+    #[arg(long = "no-header")]
+    pub no_header: bool,
+
+    /// Select specific GPU by index
+    #[arg(short = 'g', long = "gpu", value_name = "INDEX")]
+    pub gpu_index: Option<u32>,
+
+    /// Show verbose output with all available metrics
+    #[arg(short = 'v', long = "verbose")]
+    pub verbose: bool,
+
+    /// Quiet mode: minimal output
+    #[arg(short = 'q', long = "quiet", conflicts_with = "verbose")]
+    pub quiet: bool,
 }

-pub fn argparse() -> Opt {
-    let opt = Opt::from_args();
-    return opt;
-}
+impl Opt {
+    /// Parse command line arguments
+    pub fn parse_args() -> Self {
+        Opt::parse()
+    }
+}
+
+impl From<OutputFormat> for crate::output::OutputFormat {
+    fn from(value: OutputFormat) -> Self {
+        match value {
+            OutputFormat::Text => crate::output::OutputFormat::Text,
+            OutputFormat::Json => crate::output::OutputFormat::Json,
+            OutputFormat::Csv => crate::output::OutputFormat::Csv,
+        }
+    }
+}
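
Reviewer note on `watch` and `interval`: the two flags are declared mutually exclusive, and the doc comments say watch mode refreshes every second while `interval` takes a number of seconds. How `main.rs` consumes them is not part of this excerpt, so the following is only a sketch of one plausible mapping. `refresh_period` is a hypothetical helper, and clamping an interval of 0 up to one second (to avoid a busy loop) is an assumption, not behavior confirmed by the commit.

```rust
use std::time::Duration;

/// Hypothetical helper (not in the commit): turn the two CLI flags into an
/// optional refresh period. `None` means "print once and exit".
pub fn refresh_period(watch: bool, interval: Option<u64>) -> Option<Duration> {
    match (watch, interval) {
        // An explicit interval wins. clap's `conflicts_with` already rejects
        // `--watch` together with `--interval`, so `(true, Some(_))` should
        // not occur in practice; it is handled here only for totality.
        // Assumption: 0 is clamped to 1 second to avoid a busy loop.
        (_, Some(secs)) => Some(Duration::from_secs(secs.max(1))),
        // `--watch` alone: refresh every second, per the doc comment.
        (true, None) => Some(Duration::from_secs(1)),
        // Neither flag: one-shot output.
        (false, None) => None,
    }
}
```

A caller would then loop with `std::thread::sleep(period)` between renders when the result is `Some(period)`, and render once otherwise.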

src/device/mod.rs

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+pub mod nvidia;
+
+use crate::error::GpuError;
+use crate::metrics::GpuMetrics;
+
+/// Trait for GPU device implementations
+///
+/// This trait provides a vendor-agnostic interface for querying GPU metrics.
+/// Each GPU vendor (NVIDIA, AMD, Intel) would implement this trait.
+pub trait GpuDevice: Send + Sync {
+    /// Collect all available metrics from the device
+    ///
+    /// Metrics that are unavailable will be set to None.
+    fn collect_metrics(&self) -> Result<GpuMetrics, GpuError>;
+
+    /// Get the device index (0, 1, 2, ...)
+    fn index(&self) -> u32;
+
+    /// Get the vendor name (e.g., "NVIDIA", "AMD", "Intel")
+    fn vendor(&self) -> &'static str;
+}
+
+/// Trait for GPU manager implementations
+///
+/// A manager handles initialization and enumeration of GPU devices
+/// for a specific vendor.
+pub trait GpuManager: Send + Sync {
+    /// Initialize the GPU manager
+    ///
+    /// This typically loads the vendor's management library.
+    fn init() -> Result<Self, GpuError>
+    where
+        Self: Sized;
+
+    /// Get the number of GPU devices available
+    fn device_count(&self) -> Result<u32, GpuError>;
+
+    /// Get a device by its index
+    fn device(&self, index: u32) -> Result<Box<dyn GpuDevice + '_>, GpuError>;
+
+    /// Collect metrics from all devices
+    fn collect_all_metrics(&self) -> Result<Vec<GpuMetrics>, GpuError> {
+        let count = self.device_count()?;
+        let mut metrics = Vec::with_capacity(count as usize);
+        for i in 0..count {
+            let device = self.device(i)?;
+            metrics.push(device.collect_metrics()?);
+        }
+        Ok(metrics)
+    }
+
+    /// Get the driver version
+    fn driver_version(&self) -> Result<String, GpuError>;
+
+    /// Get the CUDA version (for NVIDIA) or equivalent
+    fn cuda_version(&self) -> Result<String, GpuError>;
+}
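
Reviewer note on the trait design: because `collect_all_metrics` is a provided method, a backend only has to supply `init`, `device_count`, and `device` to get whole-system collection for free. The sketch below illustrates that with an in-memory mock, which is also how the traits could be exercised without NVML present. Everything here is simplified for illustration: `GpuMetrics` and `GpuError` are hypothetical stand-ins (the real definitions live in `crate::metrics` and `crate::error`, which this excerpt does not show), the `DeviceNotFound` variant is an assumed name, and `driver_version`/`cuda_version` are omitted for brevity.

```rust
/// Hypothetical stand-in for `crate::metrics::GpuMetrics`.
#[derive(Debug, Clone, PartialEq)]
pub struct GpuMetrics {
    pub index: u32,
    pub name: String,
    pub temperature_c: Option<u32>, // None when the metric is unavailable
}

/// Hypothetical stand-in for `crate::error::GpuError`.
#[derive(Debug, PartialEq)]
pub enum GpuError {
    DeviceNotFound(u32),
}

pub trait GpuDevice: Send + Sync {
    fn collect_metrics(&self) -> Result<GpuMetrics, GpuError>;
    fn index(&self) -> u32;
    fn vendor(&self) -> &'static str;
}

pub trait GpuManager: Send + Sync {
    fn init() -> Result<Self, GpuError>
    where
        Self: Sized;
    fn device_count(&self) -> Result<u32, GpuError>;
    fn device(&self, index: u32) -> Result<Box<dyn GpuDevice + '_>, GpuError>;

    /// Same shape as the provided method in the diff: stop at the first error.
    fn collect_all_metrics(&self) -> Result<Vec<GpuMetrics>, GpuError> {
        let count = self.device_count()?;
        let mut metrics = Vec::with_capacity(count as usize);
        for i in 0..count {
            let device = self.device(i)?;
            metrics.push(device.collect_metrics()?);
        }
        Ok(metrics)
    }
}

/// In-memory device used only for this illustration.
struct MockDevice {
    index: u32,
}

impl GpuDevice for MockDevice {
    fn collect_metrics(&self) -> Result<GpuMetrics, GpuError> {
        Ok(GpuMetrics {
            index: self.index,
            name: format!("Mock GPU {}", self.index),
            temperature_c: None,
        })
    }
    fn index(&self) -> u32 {
        self.index
    }
    fn vendor(&self) -> &'static str {
        "Mock"
    }
}

/// In-memory manager that pretends two devices are installed.
pub struct MockManager {
    count: u32,
}

impl GpuManager for MockManager {
    fn init() -> Result<Self, GpuError> {
        Ok(MockManager { count: 2 })
    }
    fn device_count(&self) -> Result<u32, GpuError> {
        Ok(self.count)
    }
    fn device(&self, index: u32) -> Result<Box<dyn GpuDevice + '_>, GpuError> {
        if index < self.count {
            Ok(Box::new(MockDevice { index }))
        } else {
            Err(GpuError::DeviceNotFound(index))
        }
    }
}
```

Calling `MockManager::init()?.collect_all_metrics()` returns one `GpuMetrics` per mock device. Note that the provided method aborts on the first failing device, so a caller that wants partial results would need to iterate with `device(i)` itself.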
