Bitter Gourd Resource Database

Identification of miRNAs

1. Gather known miRNAs and pre-miRNAs from miRBase and remove redundant sequences (90% identity) using CD-HIT-EST

cd-hit-est -i hairpin.fa -o hairpin_nr.fa -c 0.9 -n 8
cd-hit-est -i mature.fa -o mature_nr.fa -c 0.9 -n 8

2. Align pre-miRNA sequences to the bitter gourd genome using BLASTn

makeblastdb -in bitter_gourd_genome.fa -dbtype nucl
blastn -query hairpin_nr.fa -db bitter_gourd_genome.fa -out pre_miRNA_alignments.txt -outfmt 6 -num_threads 4

3. Filter alignments

python filter_alignments.py pre_miRNA_alignments.txt
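
The contents of filter_alignments.py are not shown here. As a minimal sketch of what such a filter could look like for BLAST tabular output (-outfmt 6), the snippet below keeps hits above identity, length, and E-value cut-offs. The function name, the three thresholds, and the two example rows are illustrative assumptions, not the criteria used for the database.

```python
import csv
import io

# Column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_blast6(handle, min_identity=90.0, min_length=60, max_evalue=1e-5):
    """Yield hits that pass all three cut-offs (cut-off values are assumptions)."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        if (float(hit["pident"]) >= min_identity
                and int(hit["length"]) >= min_length
                and float(hit["evalue"]) <= max_evalue):
            yield hit


# Two made-up alignment rows: the first passes, the second fails every cut-off.
example = io.StringIO(
    "hairpin_A\tchr1\t95.00\t80\t4\t0\t1\t80\t1000\t1079\t1e-30\t120\n"
    "hairpin_A\tchr2\t82.00\t40\t7\t0\t1\t40\t500\t539\t0.5\t30\n"
)
kept = list(filter_blast6(example))
print([(h["qseqid"], h["sseqid"]) for h in kept])
```

For a real run, the StringIO object would be replaced by an open handle on pre_miRNA_alignments.txt.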

4. Fragment the filtered sequences into 200-nucleotide windows (25-nucleotide step) using SeqKit

seqkit sliding -s 25 -W 200 -o fragmented_sequences.fa filtered_sequences.fa
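
To illustrate what a 200-nucleotide window with a 25-nucleotide step produces, here is a small Python sketch of the same windowing logic. It is not SeqKit's implementation; the function name is hypothetical, and it assumes that trailing windows shorter than the window size are dropped.

```python
def sliding_windows(seq, window=200, step=25):
    """Return (0-based start, subsequence) for every full-length window."""
    return [
        (start, seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


seq = "ACGT" * 75  # a made-up 300-nt sequence
windows = sliding_windows(seq)
print(len(windows), [start for start, _ in windows])
```

A 300-nt input yields five windows starting at 0, 25, 50, 75, and 100; an input shorter than 200 nt yields none.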

Identification of Circular RNAs (circRNAs)

1. Align high-quality clean reads to the bitter gourd genome using BWA

bwa index bitter_gourd_genome.fa
bwa mem -T 20 bitter_gourd_genome.fa reads.fq > aligned_reads.sam

2. Identify circRNAs using CIRI2

perl CIRI2.pl -T 4 -I aligned_reads.sam -O circRNAs.txt -F bitter_gourd_genome.fa -A annotation.gtf

Identification of Long Non-Coding RNAs (lncRNAs)

1. Map RNA-seq reads using HISAT2

hisat2-build bitter_gourd_genome.fa bitter_gourd_genome
hisat2 --dta -x bitter_gourd_genome -1 reads_1.fq -2 reads_2.fq -S aligned_reads.sam

2. Assemble transcripts using StringTie

samtools sort -o aligned_reads.sorted.bam aligned_reads.sam
stringtie aligned_reads.sorted.bam -o assembled_transcripts.gtf

Identification of SNPs and Indels

1. Quality check of raw reads

fastqc *.fastq.gz

2. Read quality filtering and adapter trimming using AdapterRemoval

AdapterRemoval --file1 filename.fastq.gz --basename filename --minlength 30 --trimns --trimqualities --gzip

3. Alignment of pre-processed reads to the reference genome with BWA aln

bwa aln reference.fasta filename.fastq.gz > filename.sai

4. Variant calling

samtools mpileup -B -ugf reference.fasta filename.final.sort.rescaled.bam | bcftools call -vmO z - > filename.vcf.gz
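
Once the VCF is written, SNPs and indels can be told apart from the REF and ALT columns. The sketch below is a simplified illustration: the function name and the example records are made up, and any record that is neither a single-base substitution nor a length change is counted as "other".

```python
def count_variant_types(vcf_lines):
    """Count SNP and indel records from an iterable of VCF text lines."""
    counts = {"snp": 0, "indel": 0, "other": 0}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        alleles = [fields[3]] + fields[4].split(",")  # REF plus all ALT alleles
        if all(len(a) == 1 and a.upper() in "ACGTN" for a in alleles):
            counts["snp"] += 1
        elif len({len(a) for a in alleles}) > 1:
            counts["indel"] += 1
        else:
            counts["other"] += 1
    return counts


# Made-up records for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\t.",
    "chr1\t200\t.\tAT\tA\t60\t.\tINDEL",
    "chr1\t300\t.\tC\tCAG\t40\t.\tINDEL",
]
print(count_variant_types(example_vcf))
```

For the compressed output above, the lines could be read with gzip.open("filename.vcf.gz", "rt") and passed to the same function.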

Identification of SSRs and Polymorphisms

1. SSR identification using MISA

perl misa.pl genome1.fa

Output: genome1.fa.misa (file containing all detected SSRs)
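
For readers unfamiliar with how perfect SSRs are detected, the following Python sketch finds tandem repeats with a regular expression. It is not MISA itself. The minimum repeat counts (10 for mono-, 6 for di-, 5 for tri- to hexanucleotide motifs) are assumed to match a typical misa.ini and should be checked against the configuration actually used; the function name and test sequence are hypothetical.

```python
import re

# Motif length -> minimum number of repeats (assumed values, see text).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_compound_motif(motif):
    """True if the motif is itself a repeat of a shorter unit, e.g. 'AA' or 'ATAT'."""
    n = len(motif)
    return any(
        n % k == 0 and motif == motif[:k] * (n // k)
        for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeats) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound_motif(motif):
                continue  # already reported under the shorter motif
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start() + 1, match.end(), motif, repeats))
    return sorted(hits)


# Made-up sequence: (AT)8 at positions 4-19 and (A)12 at positions 23-34.
test_seq = "GGC" + "AT" * 8 + "CCG" + "A" * 12 + "TTC"
print(find_ssrs(test_seq))
```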

2. Extract flanking regions using a Perl script

perl ssr.pl genome1.fa.misa

3. Prepare Primer3 input files

perl p3in.pl genome1
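
Primer3 reads Boulder-IO input: one TAG=VALUE pair per line, with a line containing only "=" closing each record. The p3in.pl script is not shown here, so the sketch below only illustrates what one such record could look like for an SSR with its flanking sequence. The function name, the product size range, and the example values are assumptions.

```python
def boulder_record(seq_id, template, ssr_start, ssr_length, product_size="100-280"):
    """Build one Primer3 Boulder-IO record that targets an SSR inside `template`.

    ssr_start is 1-based; PRIMER_FIRST_BASE_INDEX=1 is written explicitly so
    that Primer3 interprets the target coordinate the same way.
    """
    lines = [
        f"SEQUENCE_ID={seq_id}",
        f"SEQUENCE_TEMPLATE={template}",
        f"SEQUENCE_TARGET={ssr_start},{ssr_length}",
        "PRIMER_FIRST_BASE_INDEX=1",
        f"PRIMER_PRODUCT_SIZE_RANGE={product_size}",
        "=",  # record terminator
    ]
    return "\n".join(lines) + "\n"


# Made-up template: 100 nt of flank, an (AT)8 repeat, then 100 nt of flank.
template = "ACGTTGCA" * 12 + "ACGT" + "AT" * 8 + "TGCAACGT" * 12 + "TGCA"
record = boulder_record("ssr_1", template, ssr_start=101, ssr_length=16)
print(record)
```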

4. Design primers using Primer3

primer3-2.3.6/src/primer3_core < reference.p3in > reference.p3out

5. Process Primer3 output

perl p3out.pl reference.p3out reference.out primers

Output files:

  • reference.out (formatted primer details)
  • primers (final primer sequences in FASTA format)
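
As a rough illustration of this post-processing step, the sketch below parses Primer3 Boulder-IO output and writes the first primer pair of each record in FASTA format. It is not the p3out.pl script: the function names, the FASTA header layout, and the example record are assumptions.

```python
def parse_boulder(text):
    """Split Boulder-IO text into a list of {tag: value} dicts, one per record."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "=":  # record terminator
            records.append(current)
            current = {}
        elif "=" in line:
            tag, value = line.split("=", 1)
            current[tag] = value
    return records


def primers_to_fasta(records):
    """Return FASTA text holding the best (index 0) left and right primer per record."""
    entries = []
    for rec in records:
        seq_id = rec.get("SEQUENCE_ID", "unknown")
        for side in ("LEFT", "RIGHT"):
            primer = rec.get(f"PRIMER_{side}_0_SEQUENCE")
            if primer:
                entries.append(f">{seq_id}_{side}\n{primer}")
    return "\n".join(entries)


# Made-up Primer3 output record for illustration only.
example_output = (
    "SEQUENCE_ID=ssr_1\n"
    "PRIMER_LEFT_0_SEQUENCE=ACGTTGCAACGTTGCAACGT\n"
    "PRIMER_RIGHT_0_SEQUENCE=TGCAACGTTGCAACGTTGCA\n"
    "=\n"
)
fasta = primers_to_fasta(parse_boulder(example_output))
print(fasta)
```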