Process and Analyze Data

This repository contains R scripts that process and analyze data from the sessions of an experiment. The focus is on heart rate (HR) data and its change relative to the initial value, ΔHR(t_x) = HR(t_x) − HR(t_0), together with electrodermal activity (EDA) and accelerometer (ACC) data. The scripts synchronize the collected data, generate visualizations, and run statistical analyses.
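
As a small illustration of the ΔHR definition, the sketch below subtracts the first HR sample from every sample. The values are made up, and treating the first sample of a session as t_0 is an assumption; the scripts in this repository may pick the reference sample differently.

```r
# Hypothetical HR samples in beats per minute; hr[1] plays the role of HR(t_0).
hr <- c(72, 74, 75, 73, 78)

# delta HR(t_x) = HR(t_x) - HR(t_0): subtract the first sample from every sample.
delta_hr <- hr - hr[1]

print(delta_hr)
```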

What this repository does

Synchronizes Empatica E4 streams (HR, EDA, ACC) to experimental video timecodes using the first tag timestamp.
Produces cleaned per-participant, per-session data in cleaned_data/ and diagnostic plots in plots/.
Provides ready-to-run statistical scripts for ΔHR, ACC, and questionnaire-based “connection” analyses.

Repository structure

data_csv/ (you provide)
- E4 streams/: per-participant folders exported from E4 Manager
- participants.csv: semicolon-separated; includes demographics and per-session metadata
- videos timecodes.csv: semicolon-separated; per-participant timecodes and a reference Timestamp
data_rds/ (generated)
- E4 streams/ data_<ID>.rds containing acc_data, eda_data, hr_data, tags_data
- participants.rds, videos_timecodes.rds
cleaned_data/ (generated)
- <ID>/HR|EDA|ACC/*_HR.rds, *_EDA.rds, *_ACC.rds and PNG plots for A,B,C,D
plots/ (generated)
R scripts for processing and stats (see below)

Data expectations and format

Place your raw inputs under data_csv/:
- Folder name exactly: E4 streams (with a space)
- Inside, one folder per participant, e.g. P01_25_12_00h, each containing at least:
  - ACC.csv (comma-separated)
  - EDA.csv (semicolon-separated)
  - HR.csv (semicolon-separated)
  - tags.csv (semicolon-separated; first row has UNIX timestamp)
- participants.csv (semicolon-separated). Must include columns:
  - ID, Order, Familiarity, A, B, C, D, isPianist, Gender, A_diagram_before, A_diagram_after, B_diagram_before, B_diagram_after, Connection_A, Connection_B, A_diagram_var, B_diagram_var
  - ID must match the per-participant folder prefix, e.g. P01
- videos timecodes.csv (semicolon-separated). Columns (no header in file):
  - Participant, A_started, A_finished, B_started, B_finished, C_started, C_finished, D_started, D_finished, Timestamp
  - Times must be in HH:MM:SS format
  - Participant must match ID
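
Because videos timecodes.csv has no header row, the column names have to be supplied when reading it. The following is a minimal sketch of one way to do that in base R, not necessarily how read_save_rds.R does it. The row is invented, and the form of the Timestamp value is an assumption.

```r
# Inline stand-in for one row of "data_csv/videos timecodes.csv"
# (semicolon-separated, no header). All values are made up.
timecodes_text <- paste(
  "P01", "00:01:10", "00:04:10", "00:06:00", "00:09:00",
  "00:11:00", "00:14:00", "00:16:00", "00:19:00", "1700000000",
  sep = ";"
)

timecode_cols <- c("Participant",
                   "A_started", "A_finished", "B_started", "B_finished",
                   "C_started", "C_finished", "D_started", "D_finished",
                   "Timestamp")

# header = FALSE because the file carries no column names of its own.
videos_timecodes <- read.csv(text = timecodes_text, sep = ";", header = FALSE,
                             col.names = timecode_cols,
                             stringsAsFactors = FALSE)

str(videos_timecodes)
```

With real data, replace the `text =` argument with the file path, for example "data_csv/videos timecodes.csv".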

Installation

Install R (>= 4.1 recommended). Install required packages once:

install.packages(c(
  "tidyverse",            # includes dplyr, ggplot2, tidyr, tibble
  "readxl", "openxlsx",
  "signal", "pracma", "FSA", "ggsignif", "Hmisc",
  "rstatix"
))

Notes:

process_and_draw_graphs.R installs dplyr and ggplot2 if missing.
Some statistical scripts require readxl, openxlsx, rstatix.
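
To see which of the packages above are still missing before running anything, a check along these lines may help. It is only a sketch: it reports the missing packages and leaves the actual install commented out.

```r
required <- c("tidyverse", "readxl", "openxlsx", "signal", "pracma",
              "FSA", "ggsignif", "Hmisc", "rstatix")

# Packages from the list that are not found in any library path.
missing_pkgs <- setdiff(required, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
} else {
  message("All required packages are installed.")
}
```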

Quickstart

Prepare input folders/files as described in “Data expectations and format”.
Convert CSVs to RDS (fast repeated loading):
- RStudio: open and Run read_save_rds.R
- Or command line: Rscript read_save_rds.R
Build cleaned per-session datasets and plots:
- RStudio: run process_and_draw_graphs.R
- Or: Rscript process_and_draw_graphs.R
Run analyses of interest by executing the corresponding script(s) below.

All scripts assume the working directory is the repository root.

Processing pipeline

read_save_rds.R
- Reads data_csv/E4 streams/<ID_...>/ACC.csv|EDA.csv|HR.csv|tags.csv
- Reads data_csv/participants.csv and data_csv/videos timecodes.csv
- Writes data_rds/E4_streams/data_<ID>.rds, plus participants.rds, videos_timecodes.rds
process_and_draw_graphs.R
- Synchronizes E4 streams to video using first tag timestamp and the videos timecodes reference
- Creates per-participant folders under cleaned_data/<ID>/HR|EDA|ACC
- Saves RDS slices per session (A|B|C|D) and plots:
  - HR: <session>_HR.rds, pulse_plot.png, delta_hr_plot.png
  - EDA: <session>_EDA.rds, pulse_plot.png (with 0.1 μS guide)
  - ACC: <session>_ACC.rds, ACC_plot.png (magnitude of x,y,z)
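
The slicing and magnitude steps can be sketched as follows. This is an illustration under assumptions, not the code in process_and_draw_graphs.R: the data frame layout, the column names (time, x, y, z) and all values are hypothetical.

```r
# Hypothetical ACC stream: column names and values are assumptions.
acc <- data.frame(
  time = as.POSIXct("2024-01-01 10:00:00", tz = "UTC") + 0:5,
  x = c(3, 0, 0, 1, 2, 6),
  y = c(4, 0, 5, 2, 2, 8),
  z = c(0, 2, 12, 2, 1, 0)
)

# Hypothetical session window: start and end of video A on the same clock.
a_start <- as.POSIXct("2024-01-01 10:00:01", tz = "UTC")
a_end   <- as.POSIXct("2024-01-01 10:00:04", tz = "UTC")

# Keep only the samples that fall inside the session window.
acc_a <- acc[acc$time >= a_start & acc$time <= a_end, ]

# Magnitude of the (x, y, z) vector for each remaining sample.
acc_a$magnitude <- sqrt(acc_a$x^2 + acc_a$y^2 + acc_a$z^2)

print(acc_a)
```

A slice like acc_a could then be written with saveRDS() to a per-session file such as A_ACC.rds.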

Time and timezone handling

Raw UNIX timestamps are converted to date-times and then shifted by −4 hours in code. Adjust the offset in process_and_draw_graphs.R if your data were recorded in a different timezone.
All time strings must be HH:MM:SS.
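
A minimal sketch of the two conversions involved, assuming the tag timestamp is in seconds since 1970-01-01 UTC. The names tag_unix, offset_hours and hms_to_seconds are hypothetical and do not come from the scripts.

```r
# Hypothetical UNIX timestamp (seconds since 1970-01-01 UTC), e.g. from tags.csv.
tag_unix <- 1700000000

# Convert to a date-time in UTC, then apply the fixed offset (in seconds).
offset_hours <- -4
tag_time <- as.POSIXct(tag_unix, origin = "1970-01-01", tz = "UTC") +
  offset_hours * 3600

# Convert a single HH:MM:SS timecode into a number of seconds.
hms_to_seconds <- function(hms) {
  parts <- as.numeric(strsplit(hms, ":", fixed = TRUE)[[1]])
  parts[1] * 3600 + parts[2] * 60 + parts[3]
}

print(format(tag_time, "%Y-%m-%d %H:%M:%S"))
print(hms_to_seconds("00:04:10"))
```

Keeping the offset in one named variable makes it easier to change when the recordings come from another timezone.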

Configuration knobs (edit in process_and_draw_graphs.R)

sessions <- c("A","B","C","D")
Axis limits (tune for your dataset):
- EDA: EDA_y_lim_min, EDA_y_lim_max
- HR: HR_y_lim_min, HR_y_lim_max
- ΔHR: HR_var_y_lim_min, HR_var_y_lim_max

Analysis scripts (run after processing)

Descriptive stats
- stats_dhr.R: per-participant, per-session mean and SD of ΔHR (first 60 seconds)
- stats_acc.R: mean and SD of centered ACC magnitude (first 60×32 samples, i.e. 60 seconds at the E4's 32 Hz ACC sampling rate)
- stats_dhr_music_type.R: ΔHR summary for Calm (A,B) vs Dynamic (C,D)
Correlations and tests
- spearman_acc_dhr.r: Spearman correlation ACC vs ΔHR
- spearman_connexion_dhr.R: Spearman correlation connection vs ΔHR
- friedman_order_dhr.r, friedman_pianist_dhr.R: Friedman tests with post hoc Wilcoxon
- kruskal_wallis_familiarity_dhr.r, kruskal_wallis_familiarity_connection.R
- mann_whitney_acc_piano_level.r
- shapiro_test_hr.R, shapiro_test_acc.R
- wilcoxon_connection_diagram_dhr.R, wilcoxon_status_dhr.R
Visualization
- box_plot_order_dhr.R: ΔHR by session order
- box_plot_pianist_dhr.R: ΔHR by pianist
- box_plots_connection_diagrams.R: box plots of connection ratings before vs after, for sessions A and B
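
For orientation, the kinds of tests listed above can be run with base R's stats functions, as in the sketch below. All numbers are synthetic and chosen only so the code runs; the variable names are hypothetical, and the repository scripts may structure their inputs differently.

```r
# Synthetic per-participant summaries; none of these values are real data.
mean_dhr <- c(1.2, 3.4, 0.5, 2.8, 4.1, 1.9)        # mean delta HR
mean_acc <- c(0.10, 0.30, 0.05, 0.22, 0.41, 0.15)  # mean ACC magnitude

# Spearman rank correlation between movement and delta HR.
spearman <- cor.test(mean_acc, mean_dhr, method = "spearman")

# Friedman test: rows are participants (blocks), columns are sessions.
dhr_by_session <- matrix(
  c(1.0, 2.0, 3.5, 4.0,
    0.5, 1.5, 2.5, 3.0,
    1.2, 1.1, 3.0, 3.8,
    0.8, 2.2, 2.9, 4.5),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C", "D"))
)
friedman <- friedman.test(dhr_by_session)

print(spearman$estimate)
print(friedman$p.value)
```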

Outputs

Console summaries and, for some scripts, plots saved to plots/
Cleaned per-session RDS files under cleaned_data/<ID>/...

Troubleshooting

Missing folders/files
- Ensure exact names: data_csv/E4 streams, participants.csv, videos timecodes.csv
Delimiters
- ACC.csv must be comma-separated; EDA/HR/tags are semicolon-separated
Participant IDs
- participants.csv:ID and videos timecodes.csv:Participant must match the participant folder prefix (e.g. P01)
Time formats
- All time fields must be HH:MM:SS. Check for leading zeros.
Timezone offset
- Adjust the −4h shift in process_and_draw_graphs.R if your recordings come from a different timezone.
Empty/mismatched sessions
- Scripts warn if session timecodes do not match stream times; verify start/end in videos timecodes.csv
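
Two of the checks above (time format and ID matching) can be automated before running the pipeline. The helpers below are a hypothetical sketch and not part of the repository; they assume the participant ID is everything before the first underscore in the folder name.

```r
# TRUE for strings shaped like HH:MM:SS with two digits per field.
is_hms <- function(x) grepl("^[0-9]{2}:[0-5][0-9]:[0-5][0-9]$", x)

# Extract the ID prefix from a folder name such as "P01_25_12_00h".
folder_id <- function(folder) sub("_.*$", "", folder)

# Missing leading zeros are rejected.
print(is_hms(c("00:04:10", "0:04:10", "00:4:10")))

# Made-up IDs and folder names: list the IDs without a matching folder.
ids     <- c("P01", "P02")
folders <- c("P01_25_12_00h", "P03_26_12_10h")
unmatched <- setdiff(ids, folder_id(folders))
print(unmatched)
```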

Prerequisites

R (>= 4.1). RStudio optional but recommended.
Install the packages listed in the Installation section above.
