Commit 7ef5fb3

Create hes_apc_procedures.md
1 parent 98171eb commit 7ef5fb3

1 file changed

Lines changed: 43 additions & 0 deletions

@@ -0,0 +1,43 @@
1+
---
2+
layout: default
3+
title: HES APC - Procedures
4+
parent: HES APC
5+
grand_parent: Curated Assets
6+
nav_order: 2
7+
permalink: /curated_assets/hes_apc/hes_apc_procedures
8+
---
9+
10+
# HES APC - Procedures
11+
12+
<a href="https://github.com/BHFDSC/hds_curated_assets/blob/main/D08-hes_apc.py" class="btn btn-primary fs-5 mb-4 mb-md-0 mr-2" target="_blank">View code on GitHub</a>
13+
14+
The *hes_apc_procedures* asset is curated from the latest archived version of the HES APC procedures table (hes_apc_otr_all_years_archive). The output is a long-format table in which each row represents a single three-digit or four-digit OPCS procedure code (opertn_01, …, opertn_nn) associated with a specific individual and hospital episode. Procedure codes are cleaned by removing non-alphanumeric characters (including punctuation and whitespace). Rows where the code is null or an empty string are then removed, so that only populated OPCS codes are retained. Both three-digit procedure codes and their corresponding four-digit variants (where available) are represented, ensuring consistency across code granularities.
15+
The resulting table includes 10 columns: 7 identifier columns (person ID, episode key, episode start date, episode end date, procedure date, admission date and discharge date) and 3 columns describing the procedure code and its position:
16+
17+
- **code**: the OPCS procedure code
18+
- **code_digits**: indicates whether the procedure code is the three- or four-digit version
19+
- **position**: indicates the position of the procedure within the episode (e.g., 1–n, corresponding to opertn_01, opertn_02, …)
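
The cleaning and reshaping described above can be illustrated with a minimal plain-Python sketch. It is not the code used to build the asset (see the GitHub link for that). The identifier names `person_id` and `epikey`, the sample episode, and the choice to store `code_digits` as the length of the cleaned code are all assumptions made for illustration. The sketch also does not attempt to reproduce how three-digit and four-digit variants are paired.

```python
import re

def clean_code(raw):
    """Strip non-alphanumeric characters; return None for null or empty codes."""
    if raw is None:
        return None
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw)
    return cleaned or None

def to_long(episode, n_positions):
    """Reshape one wide episode record (opertn_01..opertn_nn) into long-format rows."""
    rows = []
    for position in range(1, n_positions + 1):
        code = clean_code(episode.get(f"opertn_{position:02d}"))
        if code is None:
            continue  # null or empty codes are dropped
        rows.append({
            "person_id": episode["person_id"],  # hypothetical column name
            "epikey": episode["epikey"],        # hypothetical column name
            "code": code,
            "code_digits": len(code),           # assumption: stored as 3 or 4
            "position": position,
        })
    return rows

# Made-up episode with punctuation, whitespace-only and null entries
episode = {
    "person_id": "P1",
    "epikey": "E1",
    "opertn_01": "K40.1",
    "opertn_02": " - ",
    "opertn_03": None,
    "opertn_04": "W37",
}
rows = to_long(episode, n_positions=4)
```

Here only positions 1 and 4 survive, because the second entry is empty after cleaning and the third is null. The `position` value keeps the original opertn_nn index rather than being renumbered.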
20+
21+
22+
## Example
23+
24+
25+
26+
27+
The table is saved to the DSA schema **dsa_391419_j3w9t_collab**. In the table name, archived_on_date is replaced by the archive date in the format **YYYY_MM_DD**.
28+
29+
{: .highlight-title }
30+
> Table Name
31+
>
32+
> >
33+
> hds_curated_assets__hes_apc_procedure_archived_on_date
34+
35+
The code below loads the hes_apc_procedure table as archived on 1 October 2024 using PySpark:
36+
37+
{% highlight markdown %}
38+
```python
39+
import pyspark.sql.functions as f
40+
dsa = 'dsa_391419_j3w9t_collab'
41+
hes_apc_procedure = spark.table(f'{dsa}.hds_curated_assets__hes_apc_procedure_2024_10_01')
42+
```
43+
{% endhighlight %}
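
To load a different archive without hand-editing the suffix, the table name can be built from a date. The following is a small sketch, assuming the naming pattern shown above holds for every archive. The helper names `procedure_table_name` and `load_procedures` are hypothetical and not part of the curated assets code, and `spark` is expected to be an existing SparkSession.

```python
from datetime import date

DSA = "dsa_391419_j3w9t_collab"
TABLE_PREFIX = "hds_curated_assets__hes_apc_procedure"

def procedure_table_name(archived_on):
    """Build the fully qualified table name for a given archive date (YYYY_MM_DD suffix)."""
    return f"{DSA}.{TABLE_PREFIX}_{archived_on:%Y_%m_%d}"

def load_procedures(spark, archived_on):
    """Load the procedures table for one archive date using an existing SparkSession."""
    return spark.table(procedure_table_name(archived_on))

name = procedure_table_name(date(2024, 10, 1))
```

For 1 October 2024 this produces the same table name as the snippet above. Whether a table exists for any other date depends on which archives have been produced.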
