This directory contains several R scripts that process and analyze data from the sessions of an experiment. The focus is on handling heart rate (HR) data, calculating its delta, ΔHR(t_x) = HR(t_x) − HR(t_0), and processing electrodermal activity (EDA) and accelerometer (ACC) data. The scripts synchronize all collected data, generate visualizations, and perform statistical analyses.
- Synchronizes Empatica E4 streams (HR, EDA, ACC) to experimental video timecodes using the first tag timestamp.
- Produces cleaned per-participant, per-session data in cleaned_data/ and diagnostic plots in plots/.
- Provides ready-to-run statistical scripts for ΔHR, ACC, and questionnaire-based “connection” analyses.
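The ΔHR definition in the opening paragraph amounts to subtracting a reference heart rate from every sample. A minimal base-R sketch follows, assuming `hr` is a numeric vector holding one session's HR samples in time order and that t_0 is the first sample of that session. The helper name `delta_hr` is illustrative and not taken from the repository scripts.

```r
# Hypothetical helper: ΔHR(t_x) = HR(t_x) - HR(t_0) for one session.
# ASSUMPTION: hr[1] is HR(t_0), i.e. the first sample of the session.
delta_hr <- function(hr) {
  stopifnot(is.numeric(hr), length(hr) > 0)
  hr - hr[1]
}

# Toy values for illustration only (not experimental data).
hr_session <- c(72, 74, 71, 78)
delta_hr(hr_session)   # 0 2 -1 6
```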
- data_csv/ (you provide)
  - E4 streams/ per-participant folders exported from E4 Manager
  - participants.csv; semicolon-separated; includes demographics and per-session metadata
  - videos timecodes.csv; semicolon-separated; per-participant timecodes and a reference Timestamp
- data_rds/ (generated)
  - E4 streams/
    - data_<ID>.rds containing acc_data, eda_data, hr_data, tags_data
  - participants.rds, videos_timecodes.rds
- cleaned_data/ (generated)
  - <ID>/HR|EDA|ACC/ with *_HR.rds, *_EDA.rds, *_ACC.rds and PNG plots for sessions A, B, C, D
- plots/ (generated)
- R scripts for processing and stats (see below)
- Place your raw inputs under data_csv/:
  - Folder name exactly: E4 streams (with a space)
  - Inside, one folder per participant, e.g. P01_25_12_00h, each containing at least:
    - ACC.csv (comma-separated)
    - EDA.csv (semicolon-separated)
    - HR.csv (semicolon-separated)
    - tags.csv (semicolon-separated; first row has UNIX timestamp)
- participants.csv (semicolon-separated). Must include columns: ID, Order, Familiarity, A, B, C, D, isPianist, Gender, A_diagram_before, A_diagram_after, B_diagram_before, B_diagram_after, Connection_A, Connection_B, A_diagram_var, B_diagram_var
  - ID must match the per-participant folder prefix, e.g. P01
- videos timecodes.csv (semicolon-separated). Columns (no header in file): Participant, A_started, A_finished, B_started, B_finished, C_started, C_finished, D_started, D_finished, Timestamp
  - Times must be HH:MM:SS
  - Participant must match ID
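The delimiter rules above matter when loading files by hand. The sketch below writes two tiny stand-in files to a temporary directory and reads them back with base R. The file contents, the ACC column names, the header row in ACC.csv, and the variable names are illustrative assumptions; only the separators and the header-less timecode columns come from the description above.

```r
# Stand-in files in a temporary directory (contents are made up for illustration).
tmp <- tempfile("e4_demo_")
dir.create(tmp)

acc_path <- file.path(tmp, "ACC.csv")
writeLines(c("x,y,z", "1,2,3", "4,5,6"), acc_path)   # comma-separated

tc_path <- file.path(tmp, "videos timecodes.csv")
writeLines(
  "P01;00:00:10;00:03:10;00:04:00;00:07:00;00:08:00;00:11:00;00:12:00;00:15:00;14:30:00",
  tc_path
)                                                    # semicolon-separated, no header

# ACC.csv: comma-separated (a header row is ASSUMED here).
acc <- read.csv(acc_path, sep = ",")

# videos timecodes.csv: semicolon-separated without a header,
# so the column names are supplied explicitly.
timecode_cols <- c("Participant",
                   "A_started", "A_finished", "B_started", "B_finished",
                   "C_started", "C_finished", "D_started", "D_finished",
                   "Timestamp")
timecodes <- read.csv(tc_path, sep = ";", header = FALSE,
                      col.names = timecode_cols,
                      colClasses = "character")      # keep HH:MM:SS as text

str(timecodes)
```

Reading the time columns as character avoids any automatic conversion, so leading zeros survive until the strings are parsed explicitly.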
Install R (>= 4.1 recommended). Install required packages once:
    install.packages(c(
      "tidyverse",            # includes dplyr, ggplot2, tidyr, tibble
      "readxl", "openxlsx",
      "signal", "pracma", "FSA", "ggsignif", "Hmisc",
      "rstatix"
    ))

Notes:
- process_and_draw_graphs.R installs dplyr and ggplot2 if missing.
- Some statistical scripts require readxl, openxlsx, rstatix.
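To avoid reinstalling packages that are already present, one option is to check first. The sketch below uses `requireNamespace()` from base R. The helper name `missing_packages` is hypothetical and not part of the repository.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("tidyverse", "readxl", "openxlsx", "signal", "pracma",
              "FSA", "ggsignif", "Hmisc", "rstatix")

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install (needs network access)
}
```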
- Prepare input folders/files as described in “Data expectations and format”.
- Convert CSVs to RDS (fast repeated loading):
  - RStudio: open and Run read_save_rds.R
  - Or command line: Rscript read_save_rds.R
- Build cleaned per-session datasets and plots:
  - RStudio: run process_and_draw_graphs.R
  - Or: Rscript process_and_draw_graphs.R
- Run analyses of interest by executing the corresponding script(s) below.
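The CSV-to-RDS conversion step can be pictured as bundling one participant's streams into a single list and serializing it. This is a sketch of that idea with base R's `saveRDS()` and `readRDS()`, using placeholder data frames and a temporary output path. Only the element names `acc_data`, `eda_data`, `hr_data`, `tags_data` and the `data_<ID>.rds` naming come from this document; the column names and values are assumptions.

```r
# Placeholder streams (made-up values; real ones come from the E4 CSV exports).
participant_id <- "P01"
streams <- list(
  acc_data  = data.frame(x = c(1, 2), y = c(0, 1), z = c(63, 64)),
  eda_data  = data.frame(eda = c(0.12, 0.15)),
  hr_data   = data.frame(hr = c(72, 74)),
  tags_data = data.frame(unix_time = 1700000000)
)

# Write one RDS file for the participant, then read it back.
out_dir <- tempfile("data_rds_")   # stand-in for the real output folder
dir.create(out_dir)
rds_path <- file.path(out_dir, paste0("data_", participant_id, ".rds"))
saveRDS(streams, rds_path)

restored <- readRDS(rds_path)
names(restored)
```

RDS keeps column types intact, which is why reloading is faster and less error-prone than re-parsing the CSV files on every run.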
All scripts assume the working directory is the repository root.
- read_save_rds.R
  - Reads data_csv/E4 streams/<ID_...>/ACC.csv|EDA.csv|HR.csv|tags.csv
  - Reads data_csv/participants.csv and data_csv/videos timecodes.csv
  - Writes data_rds/E4_streams/data_<ID>.rds, plus participants.rds, videos_timecodes.rds
- process_and_draw_graphs.R
  - Synchronizes E4 streams to video using first tag timestamp and the videos timecodes reference
  - Creates per-participant folders under cleaned_data/<ID>/HR|EDA|ACC
  - Saves RDS slices per session (A|B|C|D) and plots:
    - HR: <session>_HR.rds, pulse_plot.png, delta_hr_plot.png
    - EDA: <session>_EDA.rds, pulse_plot.png (with 0.1 μS guide)
    - ACC: <session>_ACC.rds, ACC_plot.png (magnitude of x, y, z)
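How the scripts align stream time with the video timecodes is not spelled out here, so the following is only one plausible reading, not the repository's implementation. It assumes that the reference Timestamp is the clock time of the first tag, that session start and end times are on the same clock, and that the stream carries a column `t` with seconds elapsed since the first tag. The helper names and the `t` column are hypothetical.

```r
# Convert "HH:MM:SS" strings to seconds since midnight.
hms_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical slicing helper.
# ASSUMPTION: `reference` is the clock time of the first tag, so stream
# time t = 0 corresponds to `reference`.
slice_session <- function(stream, reference, started, finished) {
  from <- hms_to_seconds(started)  - hms_to_seconds(reference)
  to   <- hms_to_seconds(finished) - hms_to_seconds(reference)
  stream[stream$t >= from & stream$t < to, , drop = FALSE]
}

# Toy HR stream with one sample per second (values are illustrative).
hr_stream <- data.frame(t = 0:29, hr = 60 + 0:29)
session_a <- slice_session(hr_stream,
                           reference = "14:30:00",
                           started   = "14:30:05",
                           finished  = "14:30:15")
nrow(session_a)   # 10 samples: t = 5 through 14
```

The sketch does not handle recordings that cross midnight; that case would need full date-times rather than times of day.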
Time and timezone handling
- Raw UNIX times are converted then shifted by −4 hours in code. Adjust the offset in process_and_draw_graphs.R if your data/timezone differs.
- All time strings must be HH:MM:SS.
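The two rules above can be illustrated in base R. This sketch converts UNIX seconds to a date-time, applies a fixed offset, and checks time strings for the HH:MM:SS pattern. The names `tz_offset_hours`, `unix_to_shifted` and `is_hms` are assumptions for illustration; the actual variable used in process_and_draw_graphs.R may be named differently.

```r
# Fixed offset in hours; the text above describes a shift of -4 hours.
tz_offset_hours <- -4   # illustrative name; adjust for your export

# Convert UNIX seconds to POSIXct (UTC) and apply the fixed offset.
unix_to_shifted <- function(unix_seconds, offset_hours = tz_offset_hours) {
  as.POSIXct(unix_seconds, origin = "1970-01-01", tz = "UTC") + offset_hours * 3600
}

# Check that strings follow HH:MM:SS, including leading zeros.
is_hms <- function(x) grepl("^([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$", x)

format(unix_to_shifted(0), "%H:%M:%S")   # "20:00:00" (four hours before the epoch)
is_hms(c("09:05:00", "9:5:0"))           # TRUE FALSE
```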
Configuration knobs (edit in process_and_draw_graphs.R)
- sessions <- c("A","B","C","D")
- Axis limits (tune for your dataset):
  - EDA: EDA_y_lim_min, EDA_y_lim_max
  - HR: HR_y_lim_min, HR_y_lim_max
  - ΔHR: HR_var_y_lim_min, HR_var_y_lim_max
- Descriptive stats
  - stats_dhr.R: per-participant/session mean and SD of ΔHR (first 60 seconds)
  - stats_acc.R: mean and SD of centered ACC magnitude (first 60×32 samples)
  - stats_dhr_music_type.R: ΔHR summary for Calm (A,B) vs Dynamic (C,D)
- Correlations and tests
  - spearman_acc_dhr.r: Spearman correlation ACC vs ΔHR
  - spearman_connexion_dhr.R: Spearman correlation connection vs ΔHR
  - friedman_order_dhr.r, friedman_pianist_dhr.R: Friedman tests with post hoc Wilcoxon
  - kruskal_wallis_familiarity_dhr.r, kruskal_wallis_familiarity_connection.R
  - mann_whitney_acc_piano_level.r
  - shapiro_test_hr.R, shapiro_test_acc.R
  - wilcoxon_connection_diagram_dhr.R, wilcoxon_status_dhr.R
- Visualization
  - box_plot_order_dhr.R: ΔHR by session order
  - box_plot_pianist_dhr.R: ΔHR by pianist
  - box_plots_connection_diagrams.R: Connection boxes Before vs After, sessions A and B
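The descriptive scripts listed above summarize fixed windows at the start of each session. The base-R sketch below shows one way such summaries could be computed. It assumes HR is sampled once per second (so 60 seconds is 60 samples), reads 60×32 as 60 seconds of ACC at 32 samples per second, and takes "centered" to mean subtracting the mean magnitude of the whole supplied recording. The function names are hypothetical and the repository scripts may differ on each of these points.

```r
# Mean and SD of ΔHR over the first `window_s` seconds of a session.
# ASSUMPTION: `rate_hz` samples per second, and hr[1] is the reference HR(t_0).
summarise_dhr <- function(hr, window_s = 60, rate_hz = 1) {
  n <- min(length(hr), window_s * rate_hz)
  dhr <- hr[seq_len(n)] - hr[1]
  c(mean = mean(dhr), sd = sd(dhr))
}

# Mean and SD of centered ACC magnitude over the first `n_samples` samples.
# ASSUMPTION: centering subtracts the mean magnitude of the full recording.
summarise_acc <- function(x, y, z, n_samples = 60 * 32) {
  magnitude <- sqrt(x^2 + y^2 + z^2)
  centered <- magnitude - mean(magnitude)
  window <- centered[seq_len(min(length(centered), n_samples))]
  c(mean = mean(window), sd = sd(window))
}

# Toy input for illustration only.
summarise_dhr(c(70, 72, 74))   # mean 2, sd 2
```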
Outputs
- Console summaries and, for some scripts, plots saved to plots/
- Cleaned per-session RDS files under cleaned_data/<ID>/...
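The nonparametric tests named in the script list are all available in base R's stats package, and their printed results are the kind of console summary to expect. The sketch below runs a Spearman correlation, a Friedman test, and post hoc paired Wilcoxon tests on synthetic numbers. It is not the repository's code (which may rely on rstatix), the values are invented, and the Bonferroni correction is an assumption.

```r
# Synthetic per-participant values, for illustration only (not study data).
acc_sd   <- c(0.8, 1.1, 1.5, 2.0, 2.4, 3.1)   # e.g. ACC variability
dhr_mean <- c(0.5, 0.9, 1.4, 1.6, 2.8, 3.0)   # e.g. mean ΔHR

# Spearman rank correlation (exact = FALSE uses the asymptotic p-value).
sp <- cor.test(acc_sd, dhr_mean, method = "spearman", exact = FALSE)

# Friedman test: rows are participants (blocks), columns are sessions A to D.
dhr_by_session <- matrix(
  c(1.0, 1.5, 3.0, 3.5,
    0.5, 1.0, 2.5, 2.0,
    1.2, 0.8, 3.1, 3.6,
    0.9, 1.4, 2.2, 2.9,
    0.4, 1.1, 2.7, 3.3),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("P0", 1:5), c("A", "B", "C", "D"))
)
fr <- friedman.test(dhr_by_session)

# Post hoc pairwise paired Wilcoxon tests (Bonferroni correction is an assumption).
long <- data.frame(
  dhr     = as.vector(dhr_by_session),   # column-major: all of A, then B, ...
  session = factor(rep(colnames(dhr_by_session), each = nrow(dhr_by_session)))
)
ph <- pairwise.wilcox.test(long$dhr, long$session, paired = TRUE,
                           p.adjust.method = "bonferroni", exact = FALSE)

print(sp)
print(fr)
print(ph)
```

Pairing relies on each session's values being listed in the same participant order, which the column-major flattening of the matrix guarantees here.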
- Missing folders/files
  - Ensure exact names: data_csv/E4 streams, participants.csv, videos timecodes.csv
- Delimiters
  - ACC.csv must be comma-separated; EDA/HR/tags are semicolon-separated
- Participant IDs
  - participants.csv: ID and videos timecodes.csv: Participant must match the participant folder prefix (e.g. P01)
- Time formats
  - All time fields must be HH:MM:SS. Check for leading zeros.
- Timezone offset
  - Adjust the −4h shift in process_and_draw_graphs.R if your local export differs.
- Empty/mismatched sessions
  - Scripts warn if session timecodes do not match stream times; verify start/end in videos timecodes.csv
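Several of the problems above can be caught before running anything. The following pre-flight sketch reports missing input paths and mismatched participant IDs. Both helpers are hypothetical (they are not part of the repository), and the folder prefix is assumed to be everything before the first underscore in the folder name.

```r
# Hypothetical check: which expected inputs are missing under `root`?
check_inputs <- function(root = ".") {
  expected <- c(
    file.path("data_csv", "E4 streams"),           # folder name contains a space
    file.path("data_csv", "participants.csv"),
    file.path("data_csv", "videos timecodes.csv")
  )
  expected[!file.exists(file.path(root, expected))]
}

# Hypothetical check: IDs, timecode participants and folder prefixes must agree.
# ASSUMPTION: the folder prefix is everything before the first underscore.
check_ids <- function(ids, timecode_participants, folders) {
  prefixes <- sub("_.*$", "", folders)
  list(
    ids_without_folder    = setdiff(ids, prefixes),
    ids_without_timecodes = setdiff(ids, timecode_participants),
    folders_without_id    = folders[!prefixes %in% ids]
  )
}

# Illustrative values only.
id_report <- check_ids(
  ids = c("P01", "P02"),
  timecode_participants = c("P01", "P02"),
  folders = c("P01_25_12_00h", "P03_26_10_00h")
)
id_report$ids_without_folder   # "P02"
id_report$folders_without_id   # "P03_26_10_00h"
```

In practice `folders` could come from `list.dirs()` on the streams folder and `ids` from the ID column of participants.csv.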
- R (>= 4.1). RStudio optional but recommended.
- Install the packages listed in the Installation section above.