PAScall does not complete - issue in bedtools? #2

@hwessels

Hi,

I am running PAScall on my Cell Ranger 3.0 output using the Ensembl v97 annotation (all annotation is Ensembl-style, with no "chr" prefix on the chromosome names).
PAScall runs until it starts using bedtools.

environment:

module purge
module load scapture/1.0.1 # with python3 virtual environment
module load subread/2.0.3 # contains featureCounts
module load kentutils/302.1 # contains genePredToBed and gtfToGenePred
module load bedtools/2.29.0

command:

scapture -m PAScall -a SCAPTURE_annotation -g genome.fa -b sample.bam  -l 98 -o name -p 20 --polyaDB SupTab_KnownPASs_fourDBs.txt &> PAScall.log

It generates the files name.genetype.count.txt and name.CallPasPerGene.sh. All subsequent files are empty.
Can you point me to what is going wrong?

log file excerpt:

scapture path: /nfs/sw/scapture/scapture-1.0.1/
DeepPASS model file: /nfs/sw/scapture/scapture-1.0.1//DeepPASS/best_model.h5
scapture module: PAScall
Output prefix: name
prefix of annotation files from annotation module: SCAPTURE_annotation
BAM file: sample.bam
Fragment length: 98
GENOME file: genome.fa
Peak width: 400
OverlapRatio: 0.5
threads: 20
poly(a) database file: SupTab_KnownPASs_fourDBs.txt
scapture PAScall: create command line. Wed Sep  1 12:15:39 EDT 2021
scapture PAScall: create command line done. Wed Sep  1 13:33:41 EDT 2021
scapture PAScall: peak calling. Wed Sep  1 13:33:41 EDT 2021
depth: invalid option -- 'd'
open: No such file or directory
/nfs/sw/scapture/scapture-1.0.1/scapture_callpeak: line 120: 142881 Done                    bedtools intersect -a $PREFIX"."$GeneName".bam" -b $PREFIX"."$GeneName".bed" -split -f 0.95 -u -bed
     142882 Broken pipe             | bedtools bedtobam -i - -bed12 -g $Chromsize
     142883 Segmentation fault      | samtools depth -d 0 - > $PREFIX"."$GeneName".cov"
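
A note on reading this excerpt: `depth: invalid option -- 'd'` has the shape of a getopt error message, which would mean the `depth` subcommand at the end of the pipe rejected `-d`, and the earlier stages then failed with a broken pipe. The environment listed above does not load a samtools module, so it may be worth checking which `samtools` the shell actually resolves. The sketch below is only an illustration under that reading: `find_rejected_flags` and `resolve_on_path` are hypothetical helper names, not part of scapture, and the log text is copied from the excerpt.

```python
import re
import shutil

# getopt-style message, e.g. "depth: invalid option -- 'd'"
_GETOPT_ERROR = re.compile(r"^(?P<prog>\S+): invalid option -- '(?P<flag>.)'$")


def find_rejected_flags(log_text):
    """Return (program, flag) pairs for every getopt 'invalid option' line."""
    hits = []
    for line in log_text.splitlines():
        match = _GETOPT_ERROR.match(line.strip())
        if match:
            hits.append((match.group("prog"), "-" + match.group("flag")))
    return hits


def resolve_on_path(tools):
    """Map each tool name to the executable found on PATH (None if absent)."""
    return {tool: shutil.which(tool) for tool in tools}


# Lines copied from the log excerpt in this issue.
log_excerpt = """\
scapture PAScall: peak calling. Wed Sep  1 13:33:41 EDT 2021
depth: invalid option -- 'd'
open: No such file or directory
"""

print(find_rejected_flags(log_excerpt))  # [('depth', '-d')]
# Shows which binaries the current environment would actually run.
print(resolve_on_path(["samtools", "bedtools", "featureCounts"]))
```

If the path printed for `samtools` points at an unexpected or very old installation, that would be consistent with the rejected `-d` flag, though this is an assumption and not something the log confirms by itself.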

One uncertainty I have is the identity of the SupTab_KnownPASs_fourDBs.txt file. I did not find it in your GitHub repository, so I generated it from a supplementary file from the paper as a 4-column tab-delimited file with a header. I am not sure whether this matches the expected format.

Location	Strand	Source	Overlapped number
chr1:16441-16442	-	PolyADB3	3
chr1:16442-16443	-	PolyA-Seq	3
chr1:16442-16452	-	PolyASite	3
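
One thing that can be checked independently of the crash: the rows above use `chr1`, while the annotation is described as having no "chr" prefix. The sketch below compares the two naming styles. It assumes the 4-column tab-delimited layout with header shown above; whether scapture expects exactly this layout is not confirmed here. The function names and the annotation chromosome set are hypothetical placeholders.

```python
import csv
import io


def chromosomes_in_pas_table(handle):
    """Collect chromosome names from the 'Location' column (chrom:start-end)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Location"].split(":", 1)[0] for row in reader}


def uses_chr_prefix(names):
    """True if all names start with 'chr', False if none do, None if mixed or empty."""
    flags = {name.startswith("chr") for name in names}
    return flags.pop() if len(flags) == 1 else None


# Two rows copied from the table in this issue.
sample = io.StringIO(
    "Location\tStrand\tSource\tOverlapped number\n"
    "chr1:16441-16442\t-\tPolyADB3\t3\n"
    "chr1:16442-16443\t-\tPolyA-Seq\t3\n"
)
pas_style = uses_chr_prefix(chromosomes_in_pas_table(sample))
# Hypothetical Ensembl-style names standing in for the real annotation.
annotation_style = uses_chr_prefix({"1", "2", "X"})
print(pas_style, annotation_style)  # True False
```

A `True` / `False` pair like this would mean the known-PAS table and the annotation disagree on chromosome naming, which could leave overlaps empty even once the peak-calling step runs.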
