Draft
1203 commits
9aad0d3
Set some default values for the reader params.
Jan 7, 2025
5316874
Created an init method for the ascii reader params
Jan 7, 2025
1b1f0d9
use post init instead.
Jan 7, 2025
99d66a8
Accidentally deleted columns param.
Jan 8, 2025
31a2fd2
Forgot to set the defalt value for this.
Jan 8, 2025
75cadf9
Fixed typo in todo comment.
Jan 8, 2025
b8136f2
This should be false by default.
Jan 8, 2025
6505bbf
Add a default value for metadata.
Jan 8, 2025
398bb2b
Use the new defaults to make this cleaner.
Jan 8, 2025
24cf3ae
Use a function for loading a directory of files.
Jan 8, 2025
2485663
This is probably better than default factory.
Jan 8, 2025
0af684e
Changed my mind again on initing the metadata.
Jan 8, 2025
46d2c69
Typo in post init method name.
Jan 8, 2025
44d6b71
Simplify the second test.
Jan 8, 2025
da554e3
Removed this TODO comment.
Jan 8, 2025
acc7c35
Fixed the trend axis for the last test.
Jan 8, 2025
f2babfe
Test the interpolation on the trend works.
Jan 8, 2025
93e9110
Fixed call to interpolate function.
Jan 9, 2025
1cf47a6
Sasdata contains a list of quantities.
jamescrake-merani Jan 14, 2025
794323c
Make these properties.
jamescrake-merani Jan 14, 2025
265b4a5
Missing self perameters.
jamescrake-merani Jan 14, 2025
31dac33
Store the dataset type inside SasData.
jamescrake-merani Jan 14, 2025
c0c2aca
Try using a dict with validation.
jamescrake-merani Jan 15, 2025
909e43a
Make this subscriptable.
jamescrake-merani Jan 15, 2025
c17e5dc
Changes in accordance with quantities changes.
jamescrake-merani Jan 15, 2025
0f366b4
Wrote some docstrings.
jamescrake-merani Jan 15, 2025
fbf8cf5
Add a docstring for the params class.
jamescrake-merani Jan 15, 2025
ed01994
Fixed line endings.
jamescrake-merani Jan 15, 2025
ebfbb7e
Remove todo comment.
jamescrake-merani Jan 15, 2025
8c65f54
Added docstrings.
jamescrake-merani Jan 15, 2025
3dc2764
Remove this.
jamescrake-merani Jan 15, 2025
1d6b810
Use full function name here.
jamescrake-merani Jan 15, 2025
56fbd76
Return a dictionary of quantities; not a list.
jamescrake-merani Jan 15, 2025
4018f58
Merge uncertainties returns a dict.
jamescrake-merani Jan 15, 2025
7c5cb34
Add a dataset type to the ascii reader params.
jamescrake-merani Jan 16, 2025
e367a9c
Updated the SasData construction call.
jamescrake-merani Jan 16, 2025
f9fa04e
Add with standard error for quantities.
jamescrake-merani Jan 17, 2025
a1c9693
Fixed type hinting.
jamescrake-merani Jan 17, 2025
8b8ed5f
Return a quantity not a named quantity.
jamescrake-merani Jan 17, 2025
f8b7fef
Leave the original Q axis in.
jamescrake-merani Jan 17, 2025
7430b52
Remove python interpreter header.
jamescrake-merani Mar 4, 2025
289c462
Remove this comment as I don't think its true.
jamescrake-merani Mar 4, 2025
6adacdd
Skeleton implementation for ordinate and abscissae
jamescrake-merani Mar 4, 2025
07db177
These are the wrong way round.
jamescrake-merani Mar 5, 2025
6e03e35
Match statements are causing problems; use if.
jamescrake-merani Mar 5, 2025
a233b32
Implement abscissae for 2D data.
jamescrake-merani Mar 5, 2025
5c60a84
Wrote a test for 1d data.
jamescrake-merani Mar 5, 2025
997fe38
Use the value property.
jamescrake-merani Mar 5, 2025
8376806
2D test.
jamescrake-merani Mar 5, 2025
f0f7d04
Bring guess into sasdata.
Feb 12, 2025
9653f5d
Bring default logic from SasView here.
Feb 12, 2025
105e33b
Make the default value a param.
Feb 12, 2025
8468fa4
Function for making the guesses.
Feb 12, 2025
1ac6855
Forgot return statement.
Feb 12, 2025
82a2ee5
Column unit can be none.
Feb 12, 2025
bf3da48
If there are extra columns than expected, ignore.
Feb 12, 2025
418da9c
Wrote a test for the ASCII reader.
Feb 12, 2025
36a76a1
Need to get the full filename.
Feb 12, 2025
1253acc
Don't use filename to get filename :P
Feb 12, 2025
682c7fb
Property that doesn't include ignore columns.
Feb 12, 2025
4f26560
It seems I'm a bit sleepy today.
Feb 12, 2025
0ea1cdd
Use the new property.
Feb 12, 2025
004cc66
Find can accept more locations in the future.
Feb 13, 2025
647b242
Missing import.
Feb 13, 2025
bcfb6e8
Second test.
Feb 13, 2025
dd15ff2
Uncertainties should have the same units.
Feb 17, 2025
6e989f8
Add slow cubic interpolation
rprospero Feb 3, 2025
44acb80
Fix first for loop
rprospero Feb 4, 2025
18f42e6
Remove second for loop from conversion_matrix calculation
rprospero Feb 4, 2025
e5e0f28
Parameterize tests over interpolation orders
rprospero Feb 17, 2025
75a6608
Create test option for displaying diagnostic plots
rprospero Feb 17, 2025
bacc68a
More interpolation test into main test directory
rprospero Feb 17, 2025
94cc406
Temporarily add a magnetic category.
jamescrake-merani Mar 4, 2025
7b04ce6
Fixed typo.
jamescrake-merani Mar 5, 2025
e53becb
Don't compare against the quantity.
jamescrake-merani Mar 5, 2025
f3e457a
Need to convert to list before array.
jamescrake-merani Mar 5, 2025
1f73917
Apply all twice.
jamescrake-merani Mar 5, 2025
821ceaa
Use brackets properly.
jamescrake-merani Mar 5, 2025
8dccd79
Add tests for dataset
rprospero Mar 4, 2025
d76e61b
Handle collimations directly
rprospero Mar 5, 2025
097863f
Start properly descending node tree
rprospero Mar 5, 2025
9adb716
Make aperture part of collimation
rprospero Mar 5, 2025
692a775
Move to dataclasses
rprospero Mar 17, 2025
cd91c64
Parse source
rprospero Mar 17, 2025
7cafe5e
Parse source in instrument
rprospero Mar 17, 2025
edb50c1
Start fixing Detector setup
rprospero Mar 17, 2025
6b8504b
Get Detector name and wavelength properly
rprospero Mar 17, 2025
99ef0a2
Better unit parsing
rprospero Mar 17, 2025
a54409d
All the detector parts
rprospero Mar 17, 2025
b8ea913
Fix up aperture
rprospero Mar 17, 2025
5917ad5
Instrument is a data class
rprospero Mar 17, 2025
84eacd4
Keyword only dataclasses
rprospero Mar 17, 2025
e91381e
Make the Instrument metadata optional
rprospero Mar 17, 2025
8a27a35
Parse sample data
rprospero Mar 17, 2025
e3c0931
Refactor our conditional parsing code
rprospero Mar 17, 2025
ad3702d
Process Metadata as a dataclass
rprospero Mar 17, 2025
89857b5
Simplify radiation handling
rprospero Mar 17, 2025
e179a8a
Include high leve metadata in metadata
rprospero Mar 17, 2025
0b6ad96
Metadata is a dataclass
rprospero Mar 17, 2025
b49b963
Remove unused values
rprospero Mar 17, 2025
14becaf
Properly load and test multiple entries
rprospero Mar 17, 2025
c8199ac
Minor cleanup
rprospero Mar 17, 2025
70c7b57
Flag node types for parses
rprospero Mar 17, 2025
1e38eb9
Fixup after rebase
rprospero Mar 17, 2025
d811c74
Clean up import lint
rprospero Mar 17, 2025
4e8d1e2
More lint
rprospero Mar 17, 2025
3650799
More handling of the change to dict
rprospero Mar 17, 2025
fb1f832
All metadata needs to be optional to accomodate other file formats
rprospero Mar 19, 2025
9bffdea
Fix up types
rprospero Mar 19, 2025
f569ebb
Collimation is optional
rprospero Mar 19, 2025
73a5d52
Fix imports
rprospero Mar 19, 2025
232e95a
Replace optional with proper sum type
rprospero Mar 19, 2025
ff6c047
Remove unused imports
rprospero Mar 19, 2025
8376b25
Fix parsing of unit full names
rprospero Mar 19, 2025
21af33d
Remove testing bit
rprospero Mar 19, 2025
769acfa
Don't edit the file with a giant "Don't edit this file" header
rprospero Mar 19, 2025
bb1801e
Handle null title and sample in summary
rprospero Apr 2, 2025
8468a2a
Start parsing XML
rprospero Apr 2, 2025
37c52dd
Create _load_text helper
rprospero Apr 2, 2025
e1bf63c
Parse process
rprospero Apr 2, 2025
622b192
Refactor with helper functions
rprospero Apr 2, 2025
1507fcc
Fully parse xml metadata
rprospero Apr 2, 2025
3f09869
Parse IData in XML
rprospero Apr 2, 2025
6b9d67c
Remove extraneous line in hdf reader
rprospero Apr 2, 2025
200c0c8
Trial multiple files
rprospero Apr 2, 2025
2c950a8
Fix aperture load
rprospero Apr 2, 2025
a049c01
Support multiple cansas versions
rprospero Apr 2, 2025
4b4f195
More IData column types
rprospero Apr 4, 2025
fb5b2b1
Start adding official tests for xml file loading
rprospero Apr 4, 2025
22b8056
Add singular alias to degrees unit
rprospero Apr 4, 2025
aca995c
Better work at handling comments
rprospero Apr 4, 2025
bddf195
Refactor text parsing to better handle comments
rprospero Apr 4, 2025
2b19d97
More tests of xml
rprospero Apr 4, 2025
c873839
Mark weird units, but continue parsing
rprospero Apr 4, 2025
9e6636f
Add last xml test
rprospero Apr 4, 2025
1166440
Fix parsing of BeamSize
rprospero Apr 4, 2025
4735381
Add micron unit alias
rprospero Apr 8, 2025
4a31fed
Handle singular unit names in parsing
rprospero Apr 8, 2025
04c2a58
Always print summary in same order
rprospero Apr 8, 2025
4f04412
Reinstate summary output when xml loader is used as a main module
rprospero Apr 14, 2025
0cb5af6
Factor our finding cansas version into its own file
rprospero Apr 14, 2025
7f13429
More docstrings
rprospero Apr 14, 2025
c0a3fb4
Fix typo in comment
rprospero Apr 14, 2025
11a444f
Fix type hints
rprospero Apr 14, 2025
47693ec
Provide default name for unnamed data sets
rprospero Apr 14, 2025
a2484af
Better default names for SasData entries
rprospero Apr 14, 2025
bd0d70b
Process can have multiple terms, which may be quantites
rprospero Apr 14, 2025
e48e1df
Fix hdf process term parsing
rprospero Apr 14, 2025
12723c5
Correctly parse process notes
rprospero Apr 14, 2025
4881def
Add type hint to load_data filename
rprospero Apr 14, 2025
938fb8b
More concise source parsing
rprospero Apr 24, 2025
98270d8
Whitespace fix
rprospero Apr 24, 2025
e853519
Typehint on load_data
rprospero Apr 24, 2025
d71c5cb
Fix up Source parser
rprospero Apr 24, 2025
de5a31c
Add raw data to xml metadata
rprospero Apr 15, 2025
446750a
Add raw handling to reader
rprospero Apr 15, 2025
0628b76
Enable raw data filter
rprospero Apr 15, 2025
f936d24
Start testing data filter
rprospero Apr 15, 2025
2cc0e65
Run through code formatter
rprospero Apr 15, 2025
0f0f443
Simplify metadata filtering
rprospero Apr 24, 2025
71948a5
Raise KeyError instead of ValueError
rprospero May 7, 2025
de470ad
Fix up ascii reader tests
rprospero May 8, 2025
56b0b50
Update creation of SasData in trend
rprospero May 8, 2025
667acdb
Properly compare named units with unnamed units
rprospero May 8, 2025
2b04abb
Enforce loading test reference files in UTF-8
rprospero May 8, 2025
93aaece
Skeleton framework for SESANS data
rprospero May 6, 2025
d9c43f8
Start parsing sesans header
rprospero May 6, 2025
ce9ddfb
Start parsing SESANS sample metadata
rprospero May 6, 2025
eb10aed
Include SESANS angle metadata
rprospero May 6, 2025
3786393
Multiple SESANS files in reader test
rprospero May 6, 2025
3d0496e
Parse actual data from SES files
rprospero May 6, 2025
b1b99c1
Add Raw SESANS Node Data
rprospero May 7, 2025
8451a8e
Update XML test references to include apertures
rprospero May 7, 2025
584c6c2
SESANS metadata as a process, not an aperture
rprospero May 7, 2025
1aa27cd
Fixup lint
rprospero May 7, 2025
72c51ad
Make changes suggested in PR review
rprospero May 19, 2025
4e428db
Fix simple typos from rebase
rprospero Jun 3, 2025
45cf7f2
Add equality testing for quantities
rprospero Jun 10, 2025
6b8adf4
More tests for quantities
rprospero Jun 10, 2025
1e51291
Fix meshmerge calculation
rprospero Jun 17, 2025
40a4d54
Start implementing ModellingRequirements
rprospero Jun 18, 2025
b97c6e4
Start testing modelling requirements
rprospero Jun 18, 2025
d096aa9
Start adding better tests for ModellingRequirements
rprospero Jun 18, 2025
f46cdf1
Flip order of parameters on compose
rprospero Jun 18, 2025
5a1d3e2
Don't assume that Sesans includes smear
rprospero Jun 18, 2025
a6f2b1c
Enable left composition by null model
rprospero Jun 18, 2025
b57e011
support right addition of NullModel
rprospero Jun 18, 2025
ad74973
Merge pull request #122 from SasView/quantity_test_fixes
rprospero Jun 20, 2025
729894e
Allow preprocess and postprocess steps
rprospero Jun 18, 2025
03f4a50
Start performing Hankel transform for SESANS
rprospero Jun 18, 2025
b10a60f
Ignore slit smearing before SESANS
rprospero Jun 20, 2025
7f64eab
Pull sesans metadata from file
rprospero Jun 20, 2025
17f67c7
Ruff format.
jamescrake-merani Jun 23, 2025
fd1e019
This comment was repetitive.
jamescrake-merani Jun 23, 2025
88a88a8
Ruff format.
jamescrake-merani Jun 24, 2025
7b20fed
Function to guess the dataset type.
jamescrake-merani Jun 24, 2025
92b18a2
Function for loading a file with default params.
jamescrake-merani Jun 24, 2025
2ba70cb
Ruff format.
jamescrake-merani Jun 24, 2025
29884ec
Remove imports ruff is complaining aren't used.
jamescrake-merani Jun 24, 2025
6d506ae
Added a test to make sure 2d data gets read right.
jamescrake-merani Jun 24, 2025
70a734b
Makes sure the dataset type gets guessed.
jamescrake-merani Jun 24, 2025
23e2688
Use basename for sasdata ascii name.
jamescrake-merani Jun 25, 2025
c73b6b9
Add unit test for Hankel transform
rprospero Jun 20, 2025
0b64135
Fix units with error calculation
rprospero Jun 20, 2025
8a67638
χ² squared based test
rprospero Jun 20, 2025
ccf340a
Code Review Suggestions
rprospero Jun 20, 2025
e0fa047
Fix up rename in compose
rprospero Jun 20, 2025
6c70ec3
Fix uncertainty in SESANS parser
rprospero Jul 1, 2025
248d632
Be consistent with how basename is called.
jamescrake-merani Jul 4, 2025
f90636f
Move comment to avoid odd Ruff format.
jamescrake-merani Jul 4, 2025
36d8ed6
Move comment again.
jamescrake-merani Jul 4, 2025
2f50b98
Fixed test.
jamescrake-merani Jul 4, 2025
f1c7cf6
Merge pull request #123 from SasView/refactor24_default_ascii_loading
jamescrake-merani Jul 4, 2025
cfb830c
Created an import metadata function.
jamescrake-merani Jul 7, 2025
fbb80b5
Remove the old function.
jamescrake-merani Jul 7, 2025
2bcb5d9
Use the new import function.
jamescrake-merani Jul 7, 2025
3ae5d9b
Remove these imports.
jamescrake-merani Jul 7, 2025
55a19d8
Need to use keywords for this dataclass.
jamescrake-merani Jul 7, 2025
14187de
Import the mumag data for testing.
jamescrake-merani Jul 7, 2025
e8aac4e
Added a case for mumag data.
jamescrake-merani Jul 7, 2025
bbc5fcb
Added test for loading ascii data with metadata.
jamescrake-merani Jul 7, 2025
3f9ff9f
Use params filenames not just filenames.
jamescrake-merani Jul 7, 2025
469ec03
Fixed column parameters.
jamescrake-merani Jul 7, 2025
dd0192f
Doh. Missing commas on filename list.
jamescrake-merani Jul 7, 2025
d5ef68e
Roll the raw metadata into the metadata object.
jamescrake-merani Jul 7, 2025
5a5ba0d
Need to fill in all the parameters.
jamescrake-merani Jul 7, 2025
009fc22
Forgot process :P
jamescrake-merani Jul 7, 2025
56dc4f2
Consider both of these.
jamescrake-merani Jul 7, 2025
c7c9414
Combine both metadata so we go through all of them
jamescrake-merani Jul 7, 2025
7b5e83a
Raw metadata is in lists.
jamescrake-merani Jul 7, 2025
6870d75
I don't know what these decimals were.
jamescrake-merani Jul 7, 2025
50b2b26
Merge pull request #125 from SasView/refactor24_ascii_use_raw_metadata
jamescrake-merani Jul 9, 2025
8f52345
Update SESANS units in unit_kinds
rprospero Jul 14, 2025
c242948
Merge pull request #124 from rprospero/ModellingRequirements
rprospero Jul 14, 2025
bde1738
Applies auto fixes for ruff rule F401
DrPaulSharp Jul 17, 2025
f70dd33
Applies auto fixes for ruff rule F841
DrPaulSharp Jul 17, 2025
ecb25c6
Applies auto fixes for ruff rule E714
DrPaulSharp Jul 17, 2025
f885fc7
Applies auto fixes for ruff rule F541
DrPaulSharp Jul 17, 2025
7c17464
Bumps minimum python version to 3.12
DrPaulSharp Jul 17, 2025
8c33099
Removes unnecessary test code
DrPaulSharp Jul 17, 2025
24f61cb
Merge pull request #127 from SasView/refactor_24_auto-fixes
jamescrake-merani Jul 17, 2025
e92a569
Test should fail when there's no data.
Jul 18, 2025
d478640
Added a get default unit function.
Jul 18, 2025
82bbda7
Use the new get default unit function.
Jul 18, 2025
6a2aadb
Pass in the unit group as well.
Jul 18, 2025
43d2de6
Return value if it isn't None.
Jul 18, 2025
e5aa7c9
Set the dataset type properly.
Jul 18, 2025
4022d9e
Expect 2D test to fail.
Jul 18, 2025
bf98d30
Merge pull request #135 from SasView/refactor24_fix_ascii_columns
jamescrake-merani Jul 21, 2025
19dd78f
Merge refactor_24 into sasview_database and resolve merge conflicts
krzywon Jul 25, 2025
54 changes: 54 additions & 0 deletions .github/workflows/test-fair-database.yml
@@ -0,0 +1,54 @@
name: Tests

on:
  [push, pull_request]

defaults:
  run:
    shell: bash

jobs:
  unit-test:

    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [macos-latest, ubuntu-latest, windows-latest]
        python-version: ['3.12']
      fail-fast: false

    steps:

    - name: Obtain SasData source from git
      uses: actions/checkout@v4

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}
        cache: 'pip'
        cache-dependency-path: |
          **/test.yml
          **/requirements*.txt

    ### Installation of build-dependencies

    - name: Install Python dependencies
      run: |
        python -m pip install --upgrade pip
        python -m pip install wheel setuptools
        python -m pip install -r requirements.txt
        python -m pip install -r sasdata/fair_database/requirements.txt

    ### Build and test sasdata

    - name: Build sasdata
      run: |
        # BUILD SASDATA
        python -m pip install -e .

    ### Build documentation (if enabled)

    - name: Test with Django tests
      run: |
        python sasdata/fair_database/manage.py test sasdata.fair_database
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -14,7 +14,7 @@ jobs:
     strategy:
       matrix:
         os: [macos-latest, ubuntu-latest, windows-latest]
-        python-version: ['3.10', '3.11', '3.12']
+        python-version: ['3.12']
       fail-fast: false

     env:
1 change: 1 addition & 0 deletions .gitignore
@@ -24,6 +24,7 @@
 **/build
 /dist
 .mplconfig
+**/db.sqlite3

 # doc build
 /docs/sphinx-docs/build
19 changes: 19 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,19 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
        files: "sasdata/fair_database/.*"
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.9.2
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
        files: "sasdata/fair_database/.*"
      - id: ruff-format
        files: "sasdata/fair_database/.*"
  - repo: https://github.com/codespell-project/codespell
    rev: v2.3.0
    hooks:
      - id: codespell
        files: "sasdata/fair_database/.*"
10 changes: 10 additions & 0 deletions conftest.py
@@ -0,0 +1,10 @@
import pytest

def pytest_addoption(parser):
    parser.addoption(
        "--show_plots", action="store_true", default=False, help="Display diagnostic plots during tests"
    )

@pytest.fixture
def show_plots(request):
    return request.config.getoption("--show_plots")
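
The `show_plots` fixture above exposes the `--show_plots` command-line flag to tests. As a rough illustration of how a test might consume it, here is a minimal sketch. The helper names `linear_interpolate` and `run_interpolation_check` are hypothetical and are not part of this PR; the numerical check is a stand-in for whatever the real interpolation tests assert.

```python
def linear_interpolate(x0: float, x1: float, y0: float, y1: float, x: float) -> float:
    """Hypothetical helper: linear interpolation between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def run_interpolation_check(show_plots: bool) -> list[float]:
    """Body of a would-be test; `show_plots` is the value the fixture supplies."""
    xs = [0.0, 0.5, 1.0]
    ys = [linear_interpolate(0.0, 1.0, 2.0, 4.0, x) for x in xs]
    assert ys == [2.0, 3.0, 4.0]

    if show_plots:
        # Imported lazily so that a normal test run needs neither matplotlib
        # nor a display; the plot is purely diagnostic.
        import matplotlib.pyplot as plt

        plt.plot(xs, ys, "o-")
        plt.show()
    return ys
```

Under pytest, a function such as `def test_interpolation(show_plots): run_interpolation_check(show_plots)` would receive the fixture value by argument name, and running `pytest --show_plots` would switch the plots on.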
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -20,7 +20,7 @@ dynamic = [
 ]
 description = "Sas Data Loader application"
 readme = "README.md"
-requires-python = ">=3.9"
+requires-python = ">=3.12"
 license = { text = "BSD-3-Clause" }
 authors = [
     {name = "SasView Team", email = "[email protected]"},
12 changes: 12 additions & 0 deletions requirements.txt
@@ -4,3 +4,15 @@ lxml

 # Calculation
 numpy
+scipy
+
+# Unit testing
+pytest
+unittest-xml-reporting
+
+# Documentation (future)
+sphinx
+html5lib
+
+# Other stuff
+matplotlib
149 changes: 149 additions & 0 deletions sasdata/ascii_reader_metadata.py
@@ -0,0 +1,149 @@
from dataclasses import dataclass, field
from typing import TypeVar
import re

initial_metadata = {
    'source': ['name', 'radiation', 'type', 'probe_particle', 'beam_size_name', 'beam_size', 'beam_shape', 'wavelength', 'wavelength_min', 'wavelength_max', 'wavelength_spread'],
    'detector': ['name', 'distance', 'offset', 'orientation', 'beam_center', 'pixel_size', 'slit_length'],
    'aperture': ['name', 'type', 'size_name', 'size', 'distance'],
    'collimation': ['name', 'lengths'],
    'process': ['name', 'date', 'description', 'term', 'notes'],
    'sample': ['name', 'sample_id', 'thickness', 'transmission', 'temperature', 'position', 'orientation', 'details'],
    'transmission_spectrum': ['name', 'timestamp', 'transmission', 'transmission_deviation'],
    'magnetic': ['demagnetizing_field', 'saturation_magnetization', 'applied_magnetic_field', 'counting_index'],
    'other': ['title', 'run', 'definition']
}

CASING_REGEX = r'[A-Z][a-z]*'

# First item has the highest precedence.
SEPARATOR_PRECEDENCE = [
    '_',
    '-',
]
# If none of these characters exist in that string, use casing. See init_separator

T = TypeVar('T')

# TODO: There may be a better place for this.
pairings = {'I': 'dI', 'Q': 'dQ', 'Qx': 'dQx', 'Qy': 'dQy'}
pairing_error = {value: key for key, value in pairings.items()}
# Allows this to be bidirectional.
bidirectional_pairings = pairings | pairing_error

@dataclass
class AsciiMetadataCategory[T]:
    values: dict[str, T] = field(default_factory=dict)

def default_categories() -> dict[str, AsciiMetadataCategory[str | int]]:
    return {key: AsciiMetadataCategory() for key in initial_metadata.keys()}

@dataclass
class AsciiReaderMetadata:
    # Key is the filename.
    filename_specific_metadata: dict[str, dict[str, AsciiMetadataCategory[str]]] = field(default_factory=dict)
    # True instead of str means use the casing to separate the filename.
    filename_separator: dict[str, str | bool] = field(default_factory=dict)
    master_metadata: dict[str, AsciiMetadataCategory[int]] = field(default_factory=default_categories)

    def init_separator(self, filename: str):
        separator = next(filter(lambda c: c in SEPARATOR_PRECEDENCE, filename), True)
        self.filename_separator[filename] = separator

    def filename_components(self, filename: str, cut_off_extension: bool = True, capture: bool = False) -> list[str]:
        """Split the filename into several components based on the current separator for that file."""
        separator = self.filename_separator[filename]
        # FIXME: This sort of string construction may be an issue. Might need an alternative.
        base_str = '({})' if capture else '{}'
        if isinstance(separator, str):
            splitted = re.split(base_str.replace('{}', separator), filename)
        else:
            splitted = re.findall(base_str.replace('{}', CASING_REGEX), filename)
        # If the last component has a file extension, remove it.
        last_component = splitted[-1]
        if cut_off_extension and '.' in last_component:
            pos = last_component.index('.')
            last_component = last_component[:pos]
            splitted[-1] = last_component
        return splitted

    def purge_unreachable(self, filename: str):
        """This is used when the separator has changed. If, say, we now have 2 components where there were 5, and
        the 3rd component was selected, looking it up would raise an index out of range exception. We therefore purge
        any metadata entry whose component index no longer exists."""
        components = self.filename_components(filename)
        component_length = len(components)
        # Converting to list because the dictionary is mutated while we iterate over it.
        for category_name, category in list(self.master_metadata.items()):
            for key, value in list(category.values.items()):
                if value >= component_length:
                    del self.master_metadata[category_name].values[key]

    def all_file_metadata(self, filename: str) -> dict[str, AsciiMetadataCategory[str]]:
        """Return all of the metadata known for the specified filename. This
        will combine the master metadata specified for all files with the
        metadata specific to that filename."""
        file_metadata = self.filename_specific_metadata[filename]
        components = self.filename_components(filename)
        # The ordering here is important. If there are conflicts, the second dictionary will override the first one.
        # Conflicts shouldn't really be happening anyway but if they do, we're gonna go with the master metadata taking
        # precedence for now.
        return_metadata: dict[str, AsciiMetadataCategory[str]] = {}
        for category_name, category in (file_metadata | self.master_metadata).items():
            combined_category_dict = category.values | self.master_metadata[category_name].values
            new_category_dict: dict[str, str] = {}
            for key, value in combined_category_dict.items():
                if isinstance(value, str):
                    new_category_dict[key] = value
                elif isinstance(value, int):
                    new_category_dict[key] = components[value]
                else:
                    raise TypeError(f'Invalid value for {key} in {category_name}')
            new_category = AsciiMetadataCategory(new_category_dict)
            return_metadata[category_name] = new_category
        return return_metadata

    def get_metadata(self, category: str, value: str, filename: str, error_on_not_found=False) -> str | None:
        """Get a particular piece of metadata for the filename."""
        components = self.filename_components(filename)

        # We prioritise the master metadata.

        # TODO: Assumes category in master_metadata exists. Is this a reasonable assumption? May need to make sure it is
        # definitely in the dictionary.
        if value in self.master_metadata[category].values:
            index = self.master_metadata[category].values[value]
            return components[index]
        target_category = self.filename_specific_metadata[filename][category].values
        if value in target_category:
            return target_category[value]
        if error_on_not_found:
            raise ValueError('value does not exist in metadata.')
        else:
            return None

    def update_metadata(self, category: str, key: str, filename: str, new_value: str | int):
        """Update the metadata for a filename. If new_value is a string,
        the new metadata is specific to that file. If new_value is an
        integer, it is the index of the filename component that supplies
        this metadata, and it applies to all files."""
        if isinstance(new_value, str):
            self.filename_specific_metadata[filename][category].values[key] = new_value
            # TODO: What about the master metadata? Until that's gone, that still takes precedence.
        elif isinstance(new_value, int):
            self.master_metadata[category].values[key] = new_value
        else:
            raise TypeError('Invalid type for new_value')

    def clear_metadata(self, category: str, key: str, filename: str):
        """Remove any metadata recorded for a certain filename."""
        category_obj = self.filename_specific_metadata[filename][category]
        if key in category_obj.values:
            del category_obj.values[key]
        if key in self.master_metadata[category].values:
            del self.master_metadata[category].values[key]

    def add_file(self, new_filename: str):
        """Add a filename to the metadata, filling it with some default
        categories."""
        # TODO: Fix typing here. Pyright is showing errors.
        self.filename_specific_metadata[new_filename] = default_categories()
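
Two details of `ascii_reader_metadata.py` may be easier to follow with a standalone sketch. First, the comment on `SEPARATOR_PRECEDENCE` says the first item has the highest precedence, but as written `init_separator` appears to return whichever listed separator occurs first in the filename. Second, the FIXME in `filename_components` concerns building a regex from the raw separator. The functions below (`pick_separator`, `split_filename`, `resolve`) are hypothetical names, not part of the PR; they show one possible way to honour list order, escape the separator, and resolve an integer master-metadata value against the filename components.

```python
from __future__ import annotations

import re

# Mirrors the constants in the diff above.
SEPARATOR_PRECEDENCE = ["_", "-"]
CASING_REGEX = r"[A-Z][a-z]*"


def pick_separator(filename: str) -> str | bool:
    """Hypothetical variant: highest-precedence separator present, else True."""
    for candidate in SEPARATOR_PRECEDENCE:
        if candidate in filename:
            return candidate
    return True  # no separator character found, so fall back to casing


def split_filename(filename: str, separator: str | bool) -> list[str]:
    """Split a filename into components, dropping the extension from the last."""
    if isinstance(separator, str):
        # re.escape guards against separators that are regex metacharacters.
        parts = re.split(re.escape(separator), filename)
    else:
        parts = re.findall(CASING_REGEX, filename)
    if parts and "." in parts[-1]:
        parts[-1] = parts[-1][: parts[-1].index(".")]
    return parts


def resolve(value: str | int, components: list[str]) -> str:
    """A string is literal metadata; an integer indexes the filename components."""
    return value if isinstance(value, str) else components[value]


name = "sample-A_20C_run3.dat"
components = split_filename(name, pick_separator(name))
print(components)              # ['sample-A', '20C', 'run3']
print(resolve(1, components))  # 20C
```

For the filename above, a first-occurrence rule would choose `-` because it appears before `_`, whereas the list-order rule chooses `_`. Which behaviour is intended is a question for the PR author; the sketch only makes the difference visible.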
3 changes: 3 additions & 0 deletions sasdata/checklist.txt
@@ -0,0 +1,3 @@
Things to check once everything is in place:

1) Do any centigrade fields read in incorrectly?
79 changes: 79 additions & 0 deletions sasdata/data.py
@@ -0,0 +1,79 @@

import numpy as np

from sasdata import dataset_types
from sasdata.dataset_types import DatasetType
from sasdata.quantities.quantity import Quantity
from sasdata.metadata import Metadata


class SasData:
    def __init__(self, name: str,
                 data_contents: dict[str, Quantity],
                 dataset_type: DatasetType,
                 metadata: Metadata,
                 verbose: bool=False):

        self.name = name
        # validate data contents
        if not all([key in dataset_type.optional or key in dataset_type.required for key in data_contents.keys()]):
            raise ValueError("Columns don't match the dataset type")
        self._data_contents = data_contents
        self._verbose = verbose

        self.metadata = metadata

        # TODO: Could this be optional?
        self.dataset_type: DatasetType = dataset_type

        # Components that need to be organised after creation
        self.mask = None # TODO: fill out
        self.model_requirements = None # TODO: fill out

    # TODO: Handle the other data types.
    @property
    def ordinate(self) -> Quantity:
        match self.dataset_type:
            case dataset_types.one_dim | dataset_types.two_dim:
                return self._data_contents["I"]
            case dataset_types.sesans:
                return self._data_contents["Depolarisation"]
            case _:
                return None

    @property
    def abscissae(self) -> Quantity:
        match self.dataset_type:
            case dataset_types.one_dim:
                return self._data_contents['Q']
            case dataset_types.two_dim:
                # Type hinting is a bit lacking. Assume each part of the zip is a scalar value.
                data_contents = np.array(list(zip(self._data_contents['Qx'].value, self._data_contents['Qy'].value)))
                # Use this value to extract units etc. Assume they will be the same for Qy.
                reference_data_content = self._data_contents['Qx']
                # TODO: If this is a derived quantity then we are going to lose that
                # information.
                #
                # TODO: Won't work when there are errors involved. On reflection, we
                # probably want to avoid creating a new Quantity but at the moment I
                # can't see a way around it.
                return Quantity(data_contents, reference_data_content.units)
            case dataset_types.sesans:
                return self._data_contents["SpinEchoLength"]
            case _:
                return None

    def __getitem__(self, item: str):
        return self._data_contents[item]

    def summary(self, indent = " "):
        s = f"{self.name}\n"

        for data in sorted(self._data_contents, reverse=True):
            s += f"{indent}{data}\n"

        s += "Metadata:\n"
        s += "\n"
        s += self.metadata.summary()

        return s
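
The column validation in `SasData.__init__` rejects any column that is neither required nor optional for the dataset type. As written, it does not appear to check that every required column is actually present. The sketch below separates those two questions. `DatasetTypeSketch`, `unexpected_columns`, and `missing_columns` are hypothetical names used only for illustration, and the column lists given for the 1D type are assumptions rather than the values in `sasdata.dataset_types`.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetTypeSketch:
    """Hypothetical stand-in for DatasetType; only the fields used here."""

    name: str
    required: list[str]
    optional: list[str] = field(default_factory=list)


def unexpected_columns(columns: list[str], dataset_type: DatasetTypeSketch) -> list[str]:
    """Columns that are neither required nor optional (what __init__ rejects)."""
    allowed = set(dataset_type.required) | set(dataset_type.optional)
    return [column for column in columns if column not in allowed]


def missing_columns(columns: list[str], dataset_type: DatasetTypeSketch) -> list[str]:
    """Required columns that are absent (not checked by the constructor above)."""
    return [column for column in dataset_type.required if column not in columns]


# Assumed column lists, for illustration only.
one_dim = DatasetTypeSketch("1D I vs Q", required=["Q", "I"], optional=["dI", "dQ"])

print(unexpected_columns(["Q", "I", "Qx"], one_dim))  # ['Qx']
print(missing_columns(["Q", "dQ"], one_dim))          # ['I']
```

Returning the offending names, rather than a bare boolean, would also allow the `ValueError` message to say which columns did not match.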