"Command line parameters: -a CTGTAGGCACCATCAATAGATCGGAA -o 1-Cutadapted/flowcell362_lane4_pair1_Undetermined.fastq.gz --quiet flowcell362_lane4_pair1_Undetermined.fastq.gz\n",
"Trimming 1 adapter with at most 10.0% errors in single-end mode ...\n"
<div style="text-align:center">Call the cutadapt binary to strip the adapter from the reads</div>
%% Cell type:markdown id: tags:
| Option | Effect |
| -: | :- |
| -a ADAPTER | Adapter sequence to trim from the 3' end |
| -o OUTPUT | Output file |
| --quiet | Suppress the long report |
| INPUT | Input file |
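To make the effect of `-a` concrete, here is a minimal sketch of 3' adapter trimming. It is not cutadapt's algorithm: it assumes exact matching only (cutadapt tolerates errors, 10% by default as the log shows), and the function name `trim_3prime_adapter` and its `min_overlap` parameter are hypothetical names chosen for this illustration.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove `adapter` and everything after it from `read`.

    Simplified illustration: exact matches only, no error tolerance.
    """
    # Case 1: the full adapter occurs in the read; cut at its first occurrence.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter (partial adapter at the 3' end).
    # Try the longest possible overlap first.
    for overlap in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    # No adapter found: the read is returned unchanged.
    return read


ADAPTER = "CTGTAGGCACCATCAATAGATCGGAA"
print(trim_3prime_adapter("ACGTACGT" + ADAPTER + "TTTT", ADAPTER))
print(trim_3prime_adapter("ACGTACGT" + ADAPTER[:6], ADAPTER))
```

Both calls print `ACGTACGT`: in the first the full adapter and the trailing bases are removed, in the second only a six-base adapter prefix at the end of the read is removed.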
%% Cell type:code id: tags:
``` python
# Store current time
before = datetime.datetime.now()
```
%% Cell type:code id: tags:
``` python
%%bash
source ./source
# Export adapter we want to cut
export ADAPTER=CTGTAGGCACCATCAATAGATCGGAA
# Run binary
cutadapt \
    -a $ADAPTER \
    -o 1-Cutadapted/$FILENAME.fastq.gz \
    --quiet \
    $FILENAME.fastq.gz
```
%% Output
This is cutadapt 1.9.1 with Python 3.5.1
Command line parameters: -a CTGTAGGCACCATCAATAGATCGGAA -o 1-Cutadapted/flowcell362_lane4_pair1_Undetermined.fastq.gz --quiet flowcell362_lane4_pair1_Undetermined.fastq.gz
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
%% Cell type:code id: tags:
``` python
# Store current time
after = datetime.datetime.now()
# Difference
delta = after - before
print("Cutadapt run time : {0}".format(delta))
```
%% Output
Cutadapt run time : 0:03:24.043250
%% Cell type:code id: tags:
``` python
%%bash
source ./source
# Remove the untrimmed input file
rm $FILENAME.fastq.gz
```
%% Cell type:markdown id: tags:
## Unzip resulting compressed fastq
%% Cell type:code id: tags:
``` python
# Store current time
before = datetime.datetime.now()
```
%% Cell type:code id: tags:
``` python
%%bash
source ./source
# Unzip compressed file and output it in mid-process directory
```
| --best | hits guaranteed best stratum; ties broken by quality |
| INPUT | Input file |
| OUTPUT | Output file |
%% Cell type:code id: tags:
``` python
%%bash
source ./source
# Align the filtered reads against the non-coding index, SAM output
bowtie \
    -S \
    -v 3 \
    -p 8 \
    --time \
    --best \
    ref/2-Indexes/Yeast-Noncoding/Yeast-Noncoding \
    3-Filtered/$FILENAME.fastq \
    4-Bowtied/$FILENAME.sam
```
%% Cell type:markdown id: tags:
## Remove non-coding RNA

The original script skips 416 lines (0-415), which also drops the first non-header entry. Is that intended? Here I skip exactly 415 lines.

This block creates a generator that filters records according to the corresponding SAM file, keeping those with '4' in the field used by the original script. The method bets that the first non-header line of the SAM file matches the first record of the fastq. This is ugly. Don't do this. It needs refactoring so that the SAM file is parsed properly.
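One possible shape for that refactoring is sketched below: instead of relying on line positions, it matches records by read name. This is an assumption-laden sketch, not the original script: the helper names `unaligned_read_names` and `filter_fastq` are hypothetical, the fastq is assumed to use four lines per record, and the SAM read name is assumed to equal the fastq header up to its first whitespace.

```python
def unaligned_read_names(sam_lines):
    """Collect the names of reads whose SAM flag has bit 4 set (no reported alignment)."""
    names = set()
    for line in sam_lines:
        # Header lines start with '@'; their number is not assumed in advance.
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 4:
            names.add(fields[0])
    return names


def filter_fastq(fastq_lines, keep_names):
    """Yield the lines of every 4-line fastq record whose read name is in `keep_names`."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # Header looks like '@name optional description'; keep only the name.
            name = record[0][1:].split()[0]
            if name in keep_names:
                yield from record
            record = []
```

With real files, both functions can be fed open file handles directly, for example `filter_fastq(open(fastq_path), unaligned_read_names(open(sam_path)))`, where `fastq_path` and `sam_path` stand for the files of the current step. The set of names is held in memory, which is the trade-off for not depending on record order.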
The '4' in the flag field of the SAM file means that the read has no reported alignment. Here, every aligned read is a match against the non-coding tRNA index, so we keep only the unaligned reads: they are the reads that do not match non-coding RNA, in other words the reads we are interested in.

> Sum of all applicable flags. Flags relevant to Bowtie are:
>
> * 1 The read is one of a pair
> * 2 The alignment is one end of a proper paired-end alignment
> * 4 The read has no reported alignments
> * 8 The read is one of a pair and has no reported alignments
> * 16 The alignment is to the reverse reference strand
> * 32 The other mate in the paired-end alignment is aligned to the reverse reference strand
> * 64 The read is the first (#1) mate in a pair
> * 128 The read is the second (#2) mate in a pair
>
> Thus, an unpaired read that aligns to the reverse reference strand will have flag 16.
> A paired-end read that aligns and is the first mate in the pair will have flag 83 (= 64 + 16 + 2 + 1).
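Since the flag is a sum of powers of two, it can be decoded with a bitwise AND. The sketch below illustrates this on the values quoted above; the dictionary `SAM_FLAGS` and the function `decode_flag` are hypothetical names introduced for this example, and the descriptions are copied from the quoted list.

```python
# Flag bits as listed in the quoted Bowtie documentation.
SAM_FLAGS = {
    1: "The read is one of a pair",
    2: "The alignment is one end of a proper paired-end alignment",
    4: "The read has no reported alignments",
    8: "The read is one of a pair and has no reported alignments",
    16: "The alignment is to the reverse reference strand",
    32: "The other mate in the paired-end alignment is aligned to the reverse reference strand",
    64: "The read is the first (#1) mate in a pair",
    128: "The read is the second (#2) mate in a pair",
}


def decode_flag(flag):
    """Return the individual flag bits, in increasing order, that are set in `flag`."""
    return [bit for bit in sorted(SAM_FLAGS) if flag & bit]


print(decode_flag(83))  # [1, 2, 16, 64]
print(decode_flag(4))   # [4]
```

The filtering step only needs the single test `flag & 4`, which stays correct even if other bits are set alongside it.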