Hello,
This is a very helpful tool for processing htseq count data, and I am using it to calculate TPM from some RNA-seq counts after running htseq. However, I am not sure how the average read length affects the output: whether I set the read length passed via -i to 1, 5 or 100000000, the files produced are always exactly the same.
I am just producing variations of the following:
python tpm_table.py -n 20100900_E1D -c 20100900_E1D.count -i <(echo -e "20100900_E1D\t1") -l 647_dereplicated.genelengths | sort > 20100900_E1D_tpm.tsv
No error messages are produced unless I set the average read length to 0.
Maybe I have done something wrong? Or maybe the read length does not affect low counts? The highest count I have for a gene_id is 603227 (with a corresponding TPM of 60025.2112), and that TPM is identical in files where I specified an average read length of 1 or 10,000,000,000,000.
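To check my own understanding, here is a minimal sketch of the commonly used TPM definition (per-gene rate = count * read length / gene length, then scaled so the rates sum to one million). I am assuming this is roughly what tpm_table.py does, which I have not confirmed; the function name `tpm` and the gene names and numbers below are made up for illustration and are not from my data.

```python
def tpm(counts, gene_lengths, read_length):
    """TPM under the common definition (assumed, not taken from tpm_table.py)."""
    # Per-gene rate: reads * average read length / gene length
    rates = {g: counts[g] * read_length / gene_lengths[g] for g in counts}
    # Normalize by the sum of all rates and scale to one million
    total = sum(rates.values())
    return {g: rate / total * 1e6 for g, rate in rates.items()}


# Hypothetical counts and gene lengths, for illustration only
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
gene_lengths = {"geneA": 1000, "geneB": 1500, "geneC": 3000}

tpm_short = tpm(counts, gene_lengths, read_length=1)
tpm_long = tpm(counts, gene_lengths, read_length=100)

for gene in counts:
    print(gene, tpm_short[gene], tpm_long[gene])
```

If this is the formula in use, a single read length for the sample appears in both the numerator and the normalizing sum, so it cancels and any nonzero value gives the same TPM. A read length of 0 would make the sum zero and the division fail, which might be why 0 is the only value that produces an error.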
Thank you,
Caity